Ephemeral cache layers are the secret sauce that makes modern container builds lightning-fast, but they’re often misunderstood as just another layer in your Dockerfile.
Let’s watch this in action. Imagine you have a simple Go application built with this multi-stage Dockerfile:
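```dockerfile
# syntax=docker/dockerfile:1

# Build stage: its layers stay in the build cache, not in the final image.
FROM golang:1.20 AS builder
WORKDIR /app
# Copy only the dependency manifests first so the download step stays cached
# until go.mod or go.sum changes.
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/my-app .

# Final stage: only the compiled binary is copied in.
FROM alpine:latest
COPY --from=builder /app/my-app /my-app
ENTRYPOINT ["/my-app"]
```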
When BuildKit builds this, it doesn’t just blindly execute commands. It has a sophisticated understanding of what each command does and what it depends on.
Here’s how it breaks down:
- FROM golang:1.20 AS builder: This pulls the base image. BuildKit caches the image itself.
- WORKDIR /app: This sets the working directory. It’s a metadata operation, very fast.
- COPY go.mod go.sum ./: BuildKit calculates a checksum of go.mod and go.sum. If these files haven’t changed since the last build, and the parent layer (the golang:1.20 image) is the same, BuildKit reuses the cached result of this step, which keeps the cache for the next command valid.
- RUN go mod download: This is where ephemeral cache layers shine. BuildKit does not inspect what go mod download does; it keys the step on the command string and the state it runs on top of, which here is the base image plus the copied go.mod and go.sum. If those inputs haven’t changed, BuildKit doesn’t run go mod download again. It retrieves the output of this command from its cache: the downloaded Go modules, ready to be used. The result is stored as a filesystem snapshot, but because it belongs to the builder stage it is never committed to the final image. That is what makes it an ephemeral cache layer.
- COPY . .: BuildKit now calculates a checksum of every file in the build context (minus anything excluded by .dockerignore). If none of them has changed, and the go mod download cache entry is still valid, this step is reused too.
- RUN CGO_ENABLED=0 GOOS=linux go build -o /app/my-app .: This command runs on top of the downloaded modules (from the cached go mod download) and the source code (from the COPY . .). If neither has changed, BuildKit reuses the cached result of the build. This is another ephemeral cache layer.
- FROM alpine:latest: Starts the final stage from a fresh base image.
- COPY --from=builder /app/my-app /my-app: This COPY instruction is special. BuildKit knows it’s copying from a previous build stage (builder). It takes only the compiled binary (/app/my-app) from the builder stage’s cached result and adds it to the final image as a single new layer. None of the builder stage’s other layers come along.
- ENTRYPOINT ["/my-app"]: Another metadata operation.
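The walkthrough above can be modelled with a small Go sketch. It is a deliberately simplified illustration, not BuildKit's actual algorithm: the step type, chainKeys, and builderSteps are hypothetical names, the instruction strings are abbreviated, and WORKDIR is left out. Each step's key hashes the parent key, the instruction text, and the contents of the files the step reads, so a change invalidates that step and everything after it.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// step is a hypothetical, simplified stand-in for one Dockerfile instruction.
type step struct {
	instruction string   // instruction text, e.g. "RUN go mod download"
	inputs      []string // build-context files the instruction reads
}

// chainKeys returns one cache key per step. Each key hashes the parent key,
// the instruction text, and the contents of the files the step reads, so a
// change at one step also changes the key of every later step.
func chainKeys(base string, steps []step, files map[string]string) []string {
	keys := make([]string, 0, len(steps))
	parent := base
	for _, s := range steps {
		h := sha256.New()
		h.Write([]byte(parent))
		h.Write([]byte{0}) // separator between hashed fields
		h.Write([]byte(s.instruction))
		for _, name := range s.inputs {
			h.Write([]byte{0})
			h.Write([]byte(name))
			h.Write([]byte{0})
			h.Write([]byte(files[name]))
		}
		parent = hex.EncodeToString(h.Sum(nil))
		keys = append(keys, parent)
	}
	return keys
}

// builderSteps loosely mirrors the builder stage of the Dockerfile above.
func builderSteps() []step {
	return []step{
		{"COPY go.mod go.sum ./", []string{"go.mod", "go.sum"}},
		{"RUN go mod download", nil},
		{"COPY . .", []string{"go.mod", "go.sum", "main.go"}},
		{"RUN go build", nil},
	}
}

func main() {
	// Made-up file contents: only main.go differs between the two builds.
	before := map[string]string{"go.mod": "module my-app", "go.sum": "", "main.go": "v1"}
	after := map[string]string{"go.mod": "module my-app", "go.sum": "", "main.go": "v2"}

	old := chainKeys("golang:1.20", builderSteps(), before)
	cur := chainKeys("golang:1.20", builderSteps(), after)
	for i, s := range builderSteps() {
		fmt.Printf("%-24s reusable=%v\n", s.instruction, old[i] == cur[i])
	}
}
```

In this model, editing only main.go leaves the keys for the first two steps unchanged, so the module download is reused, while COPY . . and the build step get new keys and have to run again.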
The key is that BuildKit doesn’t treat a build as a linear stack of layers that must be replayed every time. For each step it stores the result (a snapshot holding downloaded modules, compiled binaries, or installed packages) together with a cache key derived from the inputs that produced it. When a RUN command’s inputs (the command itself, the parent state, and any build arguments or environment variables) haven’t changed, BuildKit retrieves the cached result instead of executing the command. This is an ephemeral cache layer: a cached result that does not necessarily end up as a layer in the final image.
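The lookup described here can be sketched in a few lines of Go. This is a toy in-memory model under stated assumptions, not BuildKit's implementation: buildCache, resolve, and the key string are hypothetical, and real cache keys are derived from a step's inputs rather than written by hand.

```go
package main

import "fmt"

// buildCache is a hypothetical in-memory stand-in for a build cache:
// it maps a cache key to the stored result of a step.
type buildCache struct {
	results map[string]string
	runs    int // how many times a step was actually executed
}

func newBuildCache() *buildCache {
	return &buildCache{results: make(map[string]string)}
}

// resolve returns the stored result for key if there is one. Only on a miss
// does it call exec, store the result, and count the execution.
func (c *buildCache) resolve(key string, exec func() string) (result string, hit bool) {
	if r, ok := c.results[key]; ok {
		return r, true
	}
	c.runs++
	r := exec()
	c.results[key] = r
	return r, false
}

func main() {
	cache := newBuildCache()
	download := func() string { return "module cache populated" }

	// Two builds with identical inputs map to the same key, so the
	// second build never calls download.
	for build := 1; build <= 2; build++ {
		_, hit := cache.resolve("key-for-go-mod-download", download)
		fmt.Printf("build %d: cache hit=%v\n", build, hit)
	}
	fmt.Println("executions:", cache.runs)
}
```

The first build misses and executes the step; the second finds the stored result under the same key and skips execution, which is the behaviour the paragraph above attributes to unchanged inputs.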
This allows BuildKit to skip commands, or entire stages, when their inputs are identical to a previous build. The "ephemeral" part refers to the fact that these cached results only show up in the final image if the final stage produces them or copies them in. Everything else lives in the build cache, more like an intelligent shortcut than a permanent layer.
When you run docker buildx build --cache-from type=registry,ref=your-registry/your-image:cache --cache-to type=registry,ref=your-registry/your-image:cache ., you’re telling BuildKit to look for these cached results in a remote registry and to push newly generated ones there. The type=registry part is crucial here, as it tells BuildKit to store the cache as a manifest and blobs in an OCI registry, the same storage an image uses. Note that the default export mode (mode=min) only exports layers that end up in the final image; to also export the results of intermediate stages such as builder, add mode=max to the --cache-to value.
The surprising thing about ephemeral cache layers is how they let BuildKit track build dependencies at a granular level. BuildKit does not understand the semantics of commands like go mod download or npm install. The granularity comes from how the Dockerfile is ordered: because go.mod and go.sum (or package-lock.json) are copied before the rest of the source, the dependency step is invalidated only when those specific files change, not on every source edit.
The next step is understanding how to explicitly control caching using the docker/dockerfile:1 syntax and RUN --mount=type=cache, which covers more complex caching scenarios than reusing command results.