BuildKit is the modern build backend for Docker images, designed around concurrent execution and efficient caching. In GitLab CI/CD, it can speed up your builds considerably, particularly for multi-stage Dockerfiles.

Here’s a GitLab CI/CD job that uses BuildKit:

build_image:
  stage: build
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  variables:
    DOCKER_BUILDKIT: 1
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    - echo "Building Docker image with BuildKit..."
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build --tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
    - echo "Build complete."

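This setup uses the docker:dind service (Docker-in-Docker) to run a Docker daemon alongside your CI job, which requires a runner configured in privileged mode. The DOCKER_BUILDKIT: 1 environment variable enables BuildKit: with it set, docker build uses the BuildKit backend instead of the legacy builder. On the 20.10 images pinned here the variable is required; from Docker Engine 23.0 onward BuildKit is the default builder on Linux, so the variable becomes redundant.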

The DOCKER_TLS_CERTDIR: "/certs" variable tells the Docker-in-Docker service where to generate its TLS certificates, so that the client in the job container and the daemon in the service container communicate over an encrypted connection. The docker login command authenticates with your GitLab Container Registry, docker build creates an image tagged with the short commit SHA, and docker push uploads it to the registry.

BuildKit’s speed comes from its parallel execution of build steps and its advanced caching mechanisms. Unlike the legacy builder, which executes Dockerfile instructions strictly in order, BuildKit first converts the whole Dockerfile into a dependency graph. Stages that do not depend on each other can run concurrently, and stages that the final target never references are skipped altogether. Instructions within a single stage still run in order, so the gains are largest for multi-stage Dockerfiles with independent stages.

One of the most powerful features of BuildKit is its content-addressable cache. Every build operation, including file copies, package installations, and other shell commands, is identified by a hash of its inputs. If the inputs to an operation haven’t changed since the last build, BuildKit can reuse the cached result without re-executing the step. Beyond what the legacy builder’s local layer cache offered, BuildKit can also export this cache and import it on another machine, which matters in CI where each job often starts with an empty daemon.

Consider a Dockerfile that first installs dependencies and then copies your application code. When only the application code changes, the dependency step is a cache hit, and only the COPY instruction and the instructions after it are re-executed. The legacy builder handles this case the same way, provided the instructions appear in that order; where BuildKit goes further is in tracking dependencies across stages, so a change in one stage does not invalidate an unrelated stage. Keep in mind that with docker:dind every job starts with a fresh daemon and an empty local cache, so these cache hits only occur in CI if a cache is imported, for example from a previously pushed image.
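
Because a docker:dind daemon starts empty on every job, one common way to get cache hits in CI is to embed cache metadata in the pushed image and import it on the next run. The job below is a sketch rather than a drop-in replacement: the job name build_image_cached and the use of a moving latest tag as the cache source are assumptions, not something the job above defines.

```yaml
build_image_cached:
  stage: build
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  variables:
    DOCKER_BUILDKIT: 1
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    # BUILDKIT_INLINE_CACHE=1 embeds cache metadata in the image being built.
    # --cache-from lets BuildKit reuse matching layers from the registry.
    # Assumption: the "latest" tag serves as the cache source.
    - >
      docker build
      --build-arg BUILDKIT_INLINE_CACHE=1
      --cache-from "$CI_REGISTRY_IMAGE:latest"
      --tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
      --tag "$CI_REGISTRY_IMAGE:latest"
      .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
    - docker push "$CI_REGISTRY_IMAGE:latest"
```

Inline cache only covers the layers of the final image, so intermediate stages of a multi-stage build are not included; whether that is enough depends on where your expensive steps live.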

The docker build --tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" . command is where the build itself runs. With DOCKER_BUILDKIT=1 set, it hands your Dockerfile to the BuildKit solver, which constructs the dependency graph of the build, runs independent branches in parallel, and reuses any cached results it can find.

Furthermore, BuildKit supports features like multi-platform builds with docker buildx (though not explicitly shown in this basic example, it’s a common extension) and better error reporting. When a build fails, BuildKit provides more detailed output, often pinpointing the exact instruction and the reason for failure, making debugging much easier.
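
To get the most out of that output in a job log, the plain progress format can be requested explicitly. BuildKit normally falls back to it when no terminal is attached, so the flag mainly makes the behaviour independent of how the runner attaches to the job. A minimal variation on the script section of the job above:

```yaml
script:
  # --progress=plain prints each step's full output line by line,
  # which is easier to read in a CI job log than the interactive view.
  - docker build --progress=plain --tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
```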

The docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" command uploads the built image. If you were to use docker buildx with BuildKit, you could push multi-architecture images to the registry.

With BuildKit enabled and a Dockerfile structured to take advantage of it (independent stages, dependency installation before the application code copy), CI jobs typically finish noticeably faster. How much faster depends on how many stages can run in parallel and on how often the cache is actually hit.

A subtle but important aspect of BuildKit’s caching is that each step is keyed on its inputs rather than on its output. For a RUN apt-get update && apt-get install -y some-package instruction, the key is derived from the command string and the filesystem state left by the preceding steps; for COPY and ADD, it is derived from the checksums of the files being copied. Changing some-package to another-package therefore invalidates the cache for that step. Changing only your application’s source code, which is copied after the package installation, leaves the apt-get step cached, so it doesn’t need to be re-executed. Note that BuildKit does not track what a RUN command fetches over the network, so a cached apt-get update step will not pick up newer package versions until its key changes.

The next thing you’ll likely want to explore is using docker buildx in conjunction with BuildKit to build multi-platform Docker images for different architectures (like amd64 and arm64) within your GitLab CI/CD pipeline.
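
As a starting point, a multi-platform job might look like the sketch below. It rests on several assumptions that are not part of the job above: a newer image tag that ships the buildx plugin, a privileged runner, QEMU emulation registered through the tonistiigi/binfmt image, and the hypothetical names build_multiarch, ci-context, and ci-builder. Treat it as an outline to adapt, not a verified configuration.

```yaml
build_multiarch:
  stage: build
  # Assumption: a newer tag than above, chosen because it bundles buildx
  # and uses BuildKit by default (no DOCKER_BUILDKIT variable needed).
  image: docker:24.0.5
  services:
    - docker:24.0.5-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    # Register QEMU emulators so non-native platforms can be built.
    - docker run --privileged --rm tonistiigi/binfmt --install all
    # Wrap the TLS connection settings in a named context, then create
    # a builder on top of it. Both names are hypothetical.
    - docker context create ci-context
    - docker buildx create --name ci-builder --use ci-context
    # Push directly: in this setup a multi-platform result is published
    # to the registry rather than loaded into the local image store.
    - >
      docker buildx build
      --platform linux/amd64,linux/arm64
      --tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
      --push
      .
```

Emulated builds are usually much slower than native ones, so it is worth measuring before making this the default path for every commit.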
