BuildKit caches its build artifacts to a local directory, but it’s not always obvious where that directory is or how to manage it.

Let’s see it in action. Imagine you’ve got a Dockerfile like this:

FROM ubuntu:latest
RUN apt-get update && apt-get install -y --no-install-recommends some-package
RUN mkdir -p /app && echo "hello world" > /app/hello.txt
CMD ["cat", "/app/hello.txt"]

When you build this with docker buildx build . --tag my-app, BuildKit first pulls the ubuntu:latest base image. On a rebuild it does not download those layers again, because they are already in its local store. It then runs apt-get update and installs some-package. On a rebuild, BuildKit checks whether it already has a cached result for this exact RUN instruction on top of the same parent layer (that is, the state of the filesystem after installation). If it does, it reuses that cached layer instead of running the command again. The same applies to the final RUN instruction that writes hello.txt.

The core idea is that the result of executing each Dockerfile instruction against a specific set of inputs can be cached. BuildKit uses content-addressable storage, meaning it identifies build steps and the artifacts they produce by a content hash. If a subsequent build presents exactly the same inputs (same parent layer, same command, same files from the build context), BuildKit reuses the cached result. This speeds up builds considerably, especially for complex applications with many dependencies.
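To make the content-addressing idea concrete, here is a minimal sketch in shell. It is not how BuildKit computes its keys internally; the cache_key function, the fake parent digest, and the use of sha256sum (GNU coreutils) are all assumptions made for illustration. It only shows that hashing the parent, the instruction, and the file contents gives the same key for the same inputs and a different key once an input changes.

```shell
#!/bin/sh
# Illustrative only: BuildKit derives its real cache keys internally from
# its build graph. This mimics the idea with sha256sum (GNU coreutils).

# cache_key PARENT_DIGEST INSTRUCTION [INPUT_FILE...]
cache_key() {
    parent=$1
    instruction=$2
    shift 2
    {
        printf 'parent:%s\n' "$parent"
        printf 'instruction:%s\n' "$instruction"
        # Hash file contents, not file names or timestamps.
        for f in "$@"; do
            sha256sum < "$f"
        done
    } | sha256sum | cut -d' ' -f1
}

workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/input.txt"

# Same parent, same instruction, same file content: identical keys.
key1=$(cache_key "sha256:aaaa" "RUN make" "$workdir/input.txt")
key2=$(cache_key "sha256:aaaa" "RUN make" "$workdir/input.txt")

# Changing the file content changes the key.
printf 'changed\n' > "$workdir/input.txt"
key3=$(cache_key "sha256:aaaa" "RUN make" "$workdir/input.txt")

[ "$key1" = "$key2" ] && echo "same inputs: cache hit"
[ "$key1" != "$key3" ] && echo "changed input: cache miss"

rm -rf "$workdir"
```

Because only content feeds the hash, touching a file without changing its bytes would still count as a hit in this sketch.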

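The problem is, where does all this cached data live? It depends on which BuildKit you are talking to. When you build through the Docker Engine’s built-in builder, BuildKit keeps its state under the Docker daemon’s data root, which you can find by inspecting the daemon’s configuration.

On Linux, this is typically /var/lib/docker/buildkit. A standalone buildkitd running as root defaults to /var/lib/buildkit instead, and a builder created with the docker-container driver keeps its state inside the builder container. On macOS or Windows, the cache lives within the Docker Desktop VM’s filesystem rather than directly on the host. You can also configure BuildKit to use a specific, user-managed directory.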
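If you want to check this on your own machine, something like the following sketch can help. It asks the engine for its data root with docker info and lists cache records with docker buildx du. The buildkit_dir_for_root helper is hypothetical and simply encodes the assumption that the embedded BuildKit state sits in a buildkit subdirectory of the data root, which may not hold for every setup.

```shell
#!/bin/sh
# buildkit_dir_for_root DATA_ROOT
# Assumption: the engine's embedded BuildKit keeps its state in a
# "buildkit" subdirectory of the data root (the usual Linux layout).
buildkit_dir_for_root() {
    printf '%s/buildkit\n' "${1%/}"
}

if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    data_root=$(docker info --format '{{ .DockerRootDir }}')
    echo "BuildKit state (expected): $(buildkit_dir_for_root "$data_root")"
    # Summarize cache records and their sizes for the current builder.
    docker buildx du || echo "docker buildx du failed (is buildx installed?)"
else
    echo "docker is not reachable; example: $(buildkit_dir_for_root /var/lib/docker)"
fi
```

On Docker Desktop the reported path refers to the VM’s filesystem, so you will not find it on the host.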

To explicitly control the cache location, start the BuildKit daemon with the --root flag, which sets the directory where it keeps all of its state. For example, if you want to store your cache in /mnt/buildcache, you would start BuildKit like this:

# This is a simplified example; actual daemon startup varies by OS and setup
# (a rootful buildkitd normally needs to be started with root privileges)
buildkitd --root /mnt/buildcache
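The daemon also reads a TOML configuration file, where the state directory is set with the root key. The sketch below writes such a file to a throwaway temp directory (an arbitrary choice for this example) and prints the command that would use it, rather than starting a daemon itself.

```shell
#!/bin/sh
# Write a minimal buildkitd.toml that moves BuildKit's state directory.
# /mnt/buildcache is the example path used in the text; the config file
# location is a temp dir chosen only for this sketch.
config_dir=$(mktemp -d)
config_file="$config_dir/buildkitd.toml"

cat > "$config_file" <<'EOF'
# Directory where buildkitd keeps cache, snapshots and metadata.
root = "/mnt/buildcache"
EOF

echo "Wrote $config_file:"
cat "$config_file"

# Starting the daemon usually needs root privileges, so only print the command.
echo "To use it: buildkitd --config $config_file"
```

Keeping the setting in a file tends to be easier to maintain than a command-line flag when the daemon is started by a service manager.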

When BuildKit runs, it serializes build state and artifacts into this directory. Each cache entry is a blob of data, and the metadata links these blobs together to form build layers. When you run a docker buildx build command, BuildKit consults this cache directory to see if it has the necessary layers to satisfy the build steps. If it finds a match, it reuses the existing layer instead of executing the command.

The cache is indexed internally by embedded key-value databases (BuildKit uses BoltDB files, not SQLite) that map build steps and their dependencies to the actual cached blobs. This allows for efficient lookups. The directory itself is a dense arrangement of hash-named entries and metadata files, not something you’d typically navigate manually.

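To clean up the cache, you can use docker buildx prune. This command removes build cache entries that are not currently in use. Adding -a widens the sweep: docker buildx prune -a removes everything that can be removed, including internal and frontend images. If you run a standalone buildkitd with a custom root directory, prefer buildctl prune against that daemon; deleting the directory’s contents by hand also works, but only after the daemon has been stopped.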
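Since pruning is destructive, it can be worth wrapping it in a small script that prints what it would do first. The sketch below is one way to do that; the run, prune_older_than_hours, and prune_everything helpers and the DRY_RUN variable are made up for this example, while the docker buildx prune flags (--force, --all, --filter until=...) are the CLI’s own.

```shell
#!/bin/sh
# Set DRY_RUN=0 to actually run the commands; by default they are only printed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Remove cache records older than the given number of hours.
prune_older_than_hours() {
    run docker buildx prune --force --filter "until=${1}h"
}

# Remove everything that can be removed, including internal/frontend images.
prune_everything() {
    run docker buildx prune --force --all
}

prune_older_than_hours 24
prune_everything
```

Running it as-is only prints the two commands; DRY_RUN=0 executes them against the current builder.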

The most surprising thing about BuildKit’s cache is that it’s not just about layer reuse based on Dockerfile instructions. BuildKit also keeps data that never becomes a distinct image layer, such as directories declared as cache mounts (for example, a Go build cache or a package manager’s download directory). That data persists between builds, lives in the same state directory, counts toward its disk usage, and is removed by prune like everything else. The flip side is that layer reuse depends only on the inputs BuildKit can see: a file outside the build context cannot affect a step’s cache key, so if the build context hash remains the same, BuildKit will reuse the cached result even though something else on your machine has changed.
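One concrete mechanism for cached data that never becomes an image layer is a cache mount, declared with RUN --mount=type=cache. The sketch below only generates a Dockerfile that uses one and prints the build command. The golang:1.22 tag, the /root/.cache/go-build target, and the presence of a Go module in the build context are assumptions for illustration.

```shell
#!/bin/sh
# Generate a Dockerfile that uses a BuildKit cache mount.
# The Go image tag and the module layout are assumptions for this sketch.
demo_dir=$(mktemp -d)

cat > "$demo_dir/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1
FROM golang:1.22
WORKDIR /src
COPY . .
# /root/.cache/go-build persists between builds but is not part of the image.
RUN --mount=type=cache,target=/root/.cache/go-build go build ./...
EOF

echo "Wrote $demo_dir/Dockerfile"
# The build needs Go sources in the context, so only print the command here.
echo "To try it: docker buildx build -f $demo_dir/Dockerfile path/to/your/go/module"
```

After a couple of builds, docker buildx du should list the mount as its own cache record, separate from the layers.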

Understanding the cache location and management is crucial for optimizing build performance and managing disk space.
