BuildKit, Docker’s modern builder, can store its build cache in cloud object storage such as S3 or GCS, dramatically speeding up builds by letting any machine reuse previously built layers instead of rebuilding them from scratch.
Let’s see it in action. Imagine you’re building a Go application.
First, you need a Dockerfile that leverages BuildKit’s cache mounts.
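# syntax=docker/dockerfile:1
FROM golang:1.21-alpine AS builder
WORKDIR /app
# Copy the module files first so the download step is reused until dependencies change
COPY go.mod go.sum ./
# Mount the Go module cache
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download
COPY . .
# Mount the module cache and the Go build cache; write the binary outside any cache mount
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    mkdir -p /output && go build -o /output/my-app .
FROM alpine:latest
COPY --from=builder /output/my-app /my-app
CMD ["/my-app"]
The key here is the pair of RUN --mount=type=cache directives, targeting /go/pkg/mod (downloaded modules) and /root/.cache/go-build (compiled packages). These tell BuildKit to keep the contents of the target directory on the builder between builds. The binary is written to /output, outside any cache mount, because files inside a cache mount are not part of the image layer and cannot be copied by a later stage.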
Cache mounts live on the builder that ran the build. To share build results across build environments, we’ll use BuildKit’s remote cache feature, which exports layer cache to a remote backend such as S3 or GCS.
Here’s how you’d configure it for S3. The cache location is passed with Buildx’s --cache-from and --cache-to flags, and credentials are read from the standard AWS environment variables:
# The s3 cache backend needs a builder other than the default docker driver
docker buildx create --name remote-cache --driver docker-container --use
# S3 Cache Configuration
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"
CACHE="type=s3,region=us-east-1,bucket=your-bucket-name,prefix=build-cache/,name=my-go-app" # Use your S3 bucket's region
docker buildx build . -t my-go-app --load \
  --cache-from "$CACHE" \
  --cache-to "$CACHE,mode=max"
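And for GCS: BuildKit does not ship a dedicated GCS cache backend at the time of writing. One workaround, shown here as an untested sketch, is to point the s3 backend at GCS’s S3-compatible XML API using HMAC keys:
# GCS Cache Configuration (via the S3-compatible endpoint; assumes HMAC keys for a service account)
# Requires a builder that uses the docker-container driver
export AWS_ACCESS_KEY_ID="YOUR_GCS_HMAC_ACCESS_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_GCS_HMAC_SECRET"
CACHE="type=s3,endpoint_url=https://storage.googleapis.com,region=auto,bucket=your-bucket-name,prefix=build-cache/,name=my-go-app"
docker buildx build . -t my-go-app --load \
  --cache-from "$CACHE" \
  --cache-to "$CACHE,mode=max"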
When you run the build with a remote cache configured, BuildKit will attempt to:
- Download Cache: Before executing the build steps, it checks the remote cache location for existing cache layers relevant to your build. If found, it downloads them instead of re-running the corresponding steps.
- Upload Cache: After a successful build, it uploads any newly created or modified cache layers to the specified S3 or GCS bucket.
This allows multiple CI/CD agents or developers to share the same build cache, significantly reducing build times, especially for projects with many dependencies or large base images. Note that the contents of --mount=type=cache directories are not part of the exported cache: they stay on the builder that created them. What the remote backend stores is layer cache, matched by each step’s instruction and inputs.
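With Buildx, remote caches are imported with --cache-from and exported with --cache-to, and --cache-from may be given more than once. One way a CI job might share a cache between branches is sketched below: import from the branch's own cache and from the main cache, then export only to the branch's cache. The helpers cache_name and s3_cache are hypothetical (they are not part of Docker or BuildKit), the bucket, region, and prefix are placeholders, and sanitizing the branch name is a precaution assumed here rather than a documented requirement. The script only prints the command.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker or BuildKit): turns a branch name
# into a string that is safe to embed in a comma-separated cache option.
cache_name() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '-'
}

# Hypothetical helper: prints the s3 cache options for a given cache name.
# Region, bucket, and prefix are placeholders.
s3_cache() {
  printf 'type=s3,region=us-east-1,bucket=your-bucket-name,prefix=build-cache/,name=%s' "$1"
}

# Assumed to be provided by the CI system; the default is only for illustration.
BRANCH="${BRANCH:-feature/login-form}"
BRANCH_CACHE="$(s3_cache "$(cache_name "$BRANCH")")"
MAIN_CACHE="$(s3_cache main)"

# Dry run: print the command rather than executing it.
# mode=max also exports layers from intermediate stages such as "builder".
echo docker buildx build . -t my-go-app \
  --cache-from "$BRANCH_CACHE" \
  --cache-from "$MAIN_CACHE" \
  --cache-to "$BRANCH_CACHE,mode=max"
```

Remove the leading echo to run the build for real. A new branch then starts from whatever the main cache already holds instead of from nothing.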
The most surprising thing about BuildKit’s remote cache is that it’s not just about storing the layers of the final image; with the exporter’s mode=max option it also caches the layers produced by intermediate steps and stages. This means that if an input such as go.mod changes, only the step that consumes it and the steps after it are re-executed. Everything before that point can still be pulled from the remote cache.
An unauthenticated connection to the BuildKit daemon is acceptable only as a convenience for local testing. In production, if the daemon is reached over the network, you must configure proper TLS certificates for it and connect over TLS. Access to the bucket is a separate concern. For S3, your AWS credentials need s3:GetObject and s3:PutObject permissions on the specified bucket. For GCS, the service account behind your credentials needs equivalent read and write permissions on the bucket’s objects.
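As a minimal sketch of the S3 permissions described above, the script below writes an IAM policy document that allows only s3:GetObject and s3:PutObject under one prefix. The bucket name, prefix, file name, and the Sid label are placeholders chosen for illustration.

```shell
#!/bin/sh
# Placeholder bucket and prefix; replace with your own values.
BUCKET="your-bucket-name"
PREFIX="build-cache"
POLICY_FILE="${POLICY_FILE:-buildkit-cache-policy.json}"

# Write a policy limited to reading and writing objects under the cache prefix.
cat > "$POLICY_FILE" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BuildKitCacheObjects",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::${BUCKET}/${PREFIX}/*"
    }
  ]
}
EOF
echo "Wrote $POLICY_FILE"
```

The file can then be attached to the role or user your build agents run as, for example with aws iam put-role-policy and its --policy-document file://... argument. Without s3:ListBucket on the bucket, S3 answers requests for missing objects with "access denied" rather than "not found"; whether your BuildKit version needs that extra permission is an assumption worth testing in your own setup.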
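Both setups assume BuildKit runs as a separate container, whether through Buildx’s docker-container driver or through a docker-container:// address in BUILDKIT_HOST when using buildctl. This keeps the daemon and its configuration isolated from the Docker engine, and it matters for remote caching in particular: the s3 cache backend is not supported by the default docker driver.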
Once you have your remote cache working, you’ll naturally want to explore how to use multiple cache backends simultaneously, perhaps a local cache for speed and a remote one for sharing.