Ccache is my favourite build system
To build C and C++ projects, you usually set up a build system like Make or CMake. Typically, these track dependencies between source files and compilation targets so that, based on modification times, they rebuild only the targets whose transitive dependencies have changed. This speeds up compilation compared with building everything from scratch. In practice, you still perform some duplicate compilations, e.g. when doing a clean build or when building different versions of the same software.
Ccache solves this problem by wrapping the compiler and caching its outputs.
When ccache detects that the same input would be compiled again, all compiler output is taken from the cache instead.
“But ccache is not a build system” I hear you say.
The pitch
But if ccache is already taking care of recompiling only the necessary files, can’t we ditch the aforementioned build tools entirely? Yes, and the result is my absolute favourite build system:
(find . -name '*.c' | xargs --max-args=1 --max-procs=0 ccache gcc -c) && gcc *.o -o main
This bash script uses xargs to invoke ccache for each source file in parallel.
At the end gcc is invoked to link all the object files together, which cannot be cached by ccache.
If these files are compiled for the first time, they are all compiled in parallel.
If you run the script again, all object files are quickly pulled from cache (in parallel).
If you change a single .c file, only that one is recompiled.
The rest is pulled from cache.
If you change a single header, only the sources including it will be recompiled.
Unfortunately, ccache does not cache anything when multiple files are passed to one command, so the even simpler ccache gcc *.c -o main does not work as intended.
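To see what --max-args=1 does here, a minimal dry run can swap the real compiler call for echo, so that xargs prints the commands it would launch instead of running them. The file names a.c, b.c and c.c are placeholders for illustration, not files from any real project.

```shell
#!/bin/bash
# Dry run: print one compiler command per source file instead of executing it.
# a.c, b.c and c.c are placeholder names used only for this illustration.
dry_run() {
    printf '%s\n' a.c b.c c.c |
        xargs --max-args=1 echo ccache gcc -c
}

dry_run
```

Each input line ends up in its own command (ccache gcc -c a.c, then b.c, then c.c), which is the single-file form that ccache is able to cache.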
Is this any good?
I really like it. I’ve been using it for all my smaller projects, because it is so simple. The question “How do I add this compile flag?” is easily answered.
Upsides:
- Extremely simple
- Easily edit compilation flags
- Easily edit build system
- Surprisingly performant
  - incremental build
  - fully parallel build
  - cached build
  - ✨clean✨ build
Downsides:
- Not easily portable
  - requires bash, or similar
  - depends on compiler CLI
  - hard coding libraries, flags
- linker invoked every time
- presumably does not scale to large code bases due to launching multiple processes for each translation unit on every single build
Obviously, a traditional build system provides many more features than xargs + ccache, such as the ability to define compilation flags in a portable way, find libraries in your system, and declare complex dependency graphs, just to name a few. They should also perform strictly better when combined with ccache, because they invoke ccache only for the files that need rebuilding.
For small projects though, ccache is fast enough.
To give you a rough idea, here are some basic hyperfine benchmarks based on my randomwalk project with 8 files.
# compile, link, full build, cmake build, cmake configure
hyperfine --warmup=10 --min-benchmarking-time=10 'xargs --max-args=1 --max-procs=50 ccache clang++ -Wall -std=c++26 -O2 -g -Ivendor/imgui -Ivendor/imgui/backends $(pkg-config --cflags sdl3) -c <sources.txt' 'clang++ -Wall -std=c++26 -O2 -g -fuse-ld=mold $(pkg-config --libs --cflags sdl3) -o main ./*.o' ./build.sh 'cmake --build build -j' 'gio trash build; mkdir -p build && pushd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DCMAKE_SHARED_LINKER_FLAGS="-fuse-ld=mold" -DCMAKE_EXE_LINKER_FLAGS="-fuse-ld=mold" -DCMAKE_VERBOSE_MAKEFILE=ON; popd'
command,mean,stddev,median,user,system,min,max
xargs --max-args=1 --max-procs=50 ccache clang++ -Wall -std=c++26 -O2 -g -Ivendor/imgui -Ivendor/imgui/backends $(pkg-config --cflags sdl3) -c <sources.txt,0.006066601464426004,0.00019500747852830948,0.006052739860000001,0.0111329819363762,0.018234544204702628,0.005603244360000001,0.00685912636
clang++ -Wall -std=c++26 -O2 -g -fuse-ld=mold $(pkg-config --libs --cflags sdl3) -o main ./*.o,0.027786096779540224,0.0005768461943693684,0.027717892860000003,0.007775711034482753,0.006169721149425283,0.026545809360000002,0.031001498360000004
./build.sh,0.04292396277176469,0.0008201344898828762,0.042812791360000005,0.019651706470588233,0.02745959843137255,0.041018994360000005,0.048181378360000006
cmake --build build -j,0.039590598263999986,0.000667556575374093,0.03950835536,0.020420704,0.018985583999999993,0.037859186360000005,0.04222505436000001
"gio trash build; mkdir -p build && pushd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DCMAKE_SHARED_LINKER_FLAGS=""-fuse-ld=mold"" -DCMAKE_EXE_LINKER_FLAGS=""-fuse-ld=mold"" -DCMAKE_VERBOSE_MAKEFILE=ON; popd",0.259492938991579,0.002473031131694414,0.25948769986,0.14207899052631573,0.10164378315789473,0.24995149636000003,0.26429419836
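Rounded, the mean times are about 6 ms for compile, 28 ms for link, 43 ms for the full build.sh, 40 ms for the CMake build, and 259 ms for the CMake configure step.
In my opinion, using only ccache delivers pretty compelling performance for how simple and flexible it is.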
Prior art
The idea of caching compilation results is certainly not new.
Zig’s compiler, as a recent example, provides an even simpler experience.
You can simply pass all sources to zig as in zig cc a.c b.c and it will cache the compilation of all individual sources.
For simple enough projects, this obviates the need for a Makefile or other build system.
zig cc: a Powerful Drop-In Replacement for GCC/Clang
This is great! ccache is just slightly more compatible in my experience, especially with C++ compilers.
Further recommendations
use a fast linker
Incremental builds require linking anew.
With the build script given above, even if nothing changed, the linker is invoked.
Therefore, linking speed is critical for a quick build cycle.
I am a happy user of mold.
In GCC and Clang you can set it with -fuse-ld=mold.
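Because -fuse-ld=mold makes the link step fail on machines where mold is not installed, a build script can probe for it first. A minimal sketch: the helper name linker_flags is made up here, and it only checks that a mold executable is on PATH, not that the compiler can actually use it.

```shell
#!/bin/bash
# Print the linker flag only if a mold executable is found on PATH.
# linker_flags is a hypothetical helper name, not part of any tool.
linker_flags() {
    if command -v mold >/dev/null 2>&1; then
        echo "-fuse-ld=mold"
    fi
}

# Show what would be passed to the compiler driver at link time.
echo "link flags: $(linker_flags)"
```

In a build script this could be used unquoted, as in gcc $(linker_flags) ./*.o -o main, so that the flag simply disappears when mold is missing and the default linker is used.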
limit parallelism
I personally like not bothering with this in the spirit of: declare your dependencies (none) and let the scheduler do the rest.
Still, it probably makes sense to limit xargs’s parallelism with e.g. --max-procs=50 to avoid excessive context switching when compiling larger projects.
I would still recommend keeping it quite a bit higher than your core count. Firstly, on cache-hits, ccache performs plenty of non-CPU work, so it can benefit from having more processes than cores. Secondly, even when actually compiling, having more processes can finish the build sooner, because tasks are completed more evenly.
[Figure: build timelines for a.c, b.c, and c.c with max-procs=2 and with max-procs=3, depending on how many parallel processes (--max-procs) are allowed.]
With max-procs=2 this will take 2 s. With max-procs=3, the build can finish in ~1.5 s, disregarding context-switching overhead.
A more integrated solution could be smarter about how much parallelism to employ for cache retrieval and compilation respectively.
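One way to follow this advice without hard-coding a number is to derive the limit from the core count. A rough sketch: the helper name pick_max_procs and the default factor of 4 are arbitrary assumptions, not measured values.

```shell
#!/bin/bash
# Derive an xargs process limit from the number of available cores.
# pick_max_procs and the default factor are assumptions for illustration.
pick_max_procs() {
    local factor=${1:-4}                   # assumed multiplier; tune per machine
    local cores
    cores=$(nproc 2>/dev/null || echo 1)   # fall back to 1 if nproc is missing
    echo $(( cores * factor ))
}

echo "--max-procs=$(pick_max_procs)"
```

The result can be passed straight to xargs, for example --max-procs="$(pick_max_procs 8)" for a more aggressive setting.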
use a copy-on-write compressed filesystem
This is my recommendation in general.
Similar to other cache implementations, ccache has the ability to use a filesystem’s native copy-on-write to duplicate a file instantly without actually duplicating any data on disk.
Ccache can also compress cached files to save space, though not when using copy-on-write, since ccache needs to provide the uncompressed content.
Because it would rule out compression, copy-on-write is disabled by default in ccache.
Enable it by putting the following in your ~/.config/ccache/ccache.conf:
file_clone = true
This disables ccache’s compression, but your filesystem can still compress the files transparently. These two filesystem features work well in conjunction to improve the speed and storage requirements of ccache. Btrfs has been working well for me.
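Whether file_clone can take effect depends on the filesystem. A quick way to probe a directory is to ask GNU cp for a forced reflink copy, which fails where cloning is unsupported. This is a sketch that assumes GNU coreutils; supports_reflink is a made-up helper name.

```shell
#!/bin/bash
# Probe whether the filesystem behind a directory supports copy-on-write clones.
# Assumes GNU cp; supports_reflink is a hypothetical helper name.
supports_reflink() {
    local dir=${1:-.} src dst
    src=$(mktemp "$dir/reflink-probe.XXXXXX") || return 2
    dst="$src.clone"
    echo "probe" > "$src"
    # --reflink=always refuses to fall back to a regular copy.
    if cp --reflink=always "$src" "$dst" 2>/dev/null; then
        echo yes
    else
        echo no
    fi
    rm -f "$src" "$dst"   # leave no probe files behind
}

supports_reflink .
```

Cloning only works within a single filesystem, so the ccache cache directory and the build directory both need to live on the same copy-on-write filesystem for this setting to help.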
throw it into build.sh
I usually create at least one build.sh file in each project directory where I save the compile command.
The input files and compile flags vary depending on the project.
The example above is stripped down for clarity. A more realistic example looks like this:
#!/bin/bash
(find . -name '*.c' -print0 | xargs -0 --max-args=1 --max-procs=50 ccache gcc -Wall -O2 -g -c) && gcc -fuse-ld=mold ./*.o -o main
rm ./*.o
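As a script like this grows, keeping the tools and flags in bash arrays makes it obvious where a new flag goes. The following is a sketch of one possible layout, not a definitive build script: the variable and function names are made up, the values mirror the example above, and --no-run-if-empty is added so that a directory without sources does not trigger a stray compiler call.

```shell
#!/bin/bash
set -euo pipefail

# Tools and flags in one place (names are illustrative; values mirror the
# example above).
launcher=(ccache)
cc=(gcc)
cflags=(-Wall -O2 -g)
ldflags=(-fuse-ld=mold)
max_procs=50

build() {
    # Compile every .c file below the current directory, one per process.
    find . -name '*.c' -print0 |
        xargs -0 --no-run-if-empty --max-args=1 --max-procs="$max_procs" \
            "${launcher[@]}" "${cc[@]}" "${cflags[@]}" -c
    # Link all objects; this step is not cached by ccache.
    "${cc[@]}" "${ldflags[@]}" ./*.o -o main
    # Remove the objects again; ccache can restore them on the next run.
    rm ./*.o
}
```

Appending a final line that calls build turns this into a drop-in build.sh. With set -e and pipefail, a failed compile aborts the function before the link step, much like the && in the one-liner.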
Closing thoughts
I love the simplicity of the xargs + ccache build system. Ccache is powerful enough to avoid compilation in the majority of cases where it is possible. A build script is easy to maintain and it is trivial to modify your compile command.
It would be great to see ccache add support for multiple files to make this build system even simpler.
Integrated multiple file support would even allow optimizing the parallelism issue hinted at above.
I also considered implementing this as a small CLI atop ccache, with the same interface, that automatically separates files from flags.
However, this would be a tough sell, considering you can simply use xargs instead.
I hope you enjoyed this blog post and give ccache a try :) Thanks a lot to Lars Quentin, Hossein Biniaz, and my other reviewers.