BuildKit can run Dockerfile stages in parallel: it identifies stages that do not depend on each other and executes them concurrently, which can reduce build times substantially.

Here’s how it works in practice. Imagine a Dockerfile that builds a Go application and then copies the binary into a lean alpine image:

# Stage 1: Build the Go application
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp .

# Stage 2: Create the final image
FROM alpine:latest
COPY --from=builder /app/myapp /usr/local/bin/myapp
CMD ["myapp"]

With the legacy builder, these stages run sequentially in the order they appear: first builder, then the final alpine stage. In this two-stage Dockerfile BuildKit has little to parallelize either, because the final stage depends on builder through the COPY --from=builder instruction; at most it can fetch the alpine base image while the Go build runs. Parallelism pays off once there is another, completely separate stage, say one that builds frontend assets with Node.js: BuildKit can then build it at the same time as the Go application, provided neither stage depends on the other.

Let’s add another stage to illustrate:

# Stage 1: Build the Go application
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp .

# Stage 2: Build frontend assets
FROM node:18-alpine AS frontend
WORKDIR /app/frontend
COPY frontend/package.json frontend/package-lock.json ./
RUN npm ci
COPY frontend/ ./
RUN npm run build

# Stage 3: Create the final image
FROM alpine:latest
COPY --from=builder /app/myapp /usr/local/bin/myapp
COPY --from=frontend /app/frontend/dist /var/www/html
CMD ["myapp"]

When you build this with BuildKit enabled (it has been the default builder since Docker Engine 23.0), BuildKit analyzes the dependencies. It sees that builder and frontend have no dependencies on each other; the only links are the COPY --from=builder and COPY --from=frontend instructions in Stage 3. BuildKit can therefore run go build in builder and npm ci and npm run build in frontend concurrently. In the final stage, each COPY --from instruction waits until the stage it references has completed.

The key to this parallelism lies in how BuildKit constructs its execution graph. It translates the Dockerfile into a dependency graph of build steps (for example FROM, RUN, COPY, ADD) and records, for each step, which earlier steps it depends on. Any step whose prerequisites have all completed can be scheduled, regardless of what is still running in unrelated branches of the graph. The analysis also works across stages: COPY --from=stage_name depends on the final filesystem state of the referenced stage, not on a single instruction within it.
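
The dependency analysis described above can be approximated with a small Python sketch. It is a simplified, stage-level model and not BuildKit's actual implementation, which works on individual build steps. The names `stage_dependencies` and `parallel_waves`, and the `stage-N` label for unnamed stages, are hypothetical and exist only for this illustration. The parser handles only `FROM ... AS name` and `COPY --from=name`; it ignores numeric stage references, `FROM` flags such as `--platform`, and `RUN --mount=from=...`.

```python
import re

# The three-stage Dockerfile from the example above.
DOCKERFILE = """\
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp .

FROM node:18-alpine AS frontend
WORKDIR /app/frontend
COPY frontend/package.json frontend/package-lock.json ./
RUN npm ci
COPY frontend/ ./
RUN npm run build

FROM alpine:latest
COPY --from=builder /app/myapp /usr/local/bin/myapp
COPY --from=frontend /app/frontend/dist /var/www/html
CMD ["myapp"]
"""

FROM_RE = re.compile(r"^FROM\s+(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE)
COPY_FROM_RE = re.compile(r"^COPY\s+.*--from=(\S+)", re.IGNORECASE)


def stage_dependencies(dockerfile: str) -> dict[str, set[str]]:
    """Map each stage to the set of earlier stages it depends on."""
    deps: dict[str, set[str]] = {}
    current = None
    index = 0
    for raw in dockerfile.splitlines():
        line = raw.strip()
        m = FROM_RE.match(line)
        if m:
            base, alias = m.group(1), m.group(2)
            # Hypothetical label for unnamed stages: "stage-<index>".
            current = alias or f"stage-{index}"
            index += 1
            # "FROM <earlier stage>" is a dependency; an external image is not.
            deps[current] = {base} if base in deps else set()
            continue
        m = COPY_FROM_RE.match(line)
        if m and current is not None and m.group(1) in deps:
            deps[current].add(m.group(1))
    return deps


def parallel_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group stages into waves; stages in the same wave are independent."""
    remaining = {stage: set(d) for stage, d in deps.items()}
    waves = []
    while remaining:
        ready = sorted(s for s, d in remaining.items() if not d)
        if not ready:
            raise ValueError("cyclic stage dependencies")
        waves.append(ready)
        for stage in ready:
            del remaining[stage]
        for d in remaining.values():
            d.difference_update(ready)
    return waves


print(parallel_waves(stage_dependencies(DOCKERFILE)))
# prints [['builder', 'frontend'], ['stage-2']]
```

The first wave contains builder and frontend, which is exactly the pair that can run concurrently. Because this model treats a stage as one unit, it is coarser than a step-level graph, where individual steps of the final stage can be scheduled as soon as their own inputs are ready.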

To ensure BuildKit is active, you can explicitly enable it in your Docker daemon configuration (/etc/docker/daemon.json on Linux):

{
  "features": {
    "buildkit": true
  }
}

Then restart the Docker daemon: sudo systemctl restart docker.

Alternatively, you can use the DOCKER_BUILDKIT=1 environment variable before running docker build:

DOCKER_BUILDKIT=1 docker build -t my-app .

The parallelism you actually observe depends on what your stages do and on the resources of the build host. Stages with different bottlenecks overlap well: a CPU-bound compile in one stage can run while another stage waits on the network to download dependencies. When concurrent stages compete for the same limited resource, such as CPU cores or network bandwidth, the gains are smaller, but BuildKit still resolves the dependencies correctly.
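
To get a rough feel for the possible gain, you can compare the sum of all stage durations with the longest dependency chain (the critical path). The Python sketch below does this for the three-stage example. The durations are invented for illustration, the stage name `final` is a hypothetical label for the unnamed last stage, and the result is a best case that assumes the stages do not slow each other down.

```python
from functools import lru_cache

# Hypothetical durations in seconds; measure your own build for real numbers.
DURATIONS = {"builder": 40.0, "frontend": 65.0, "final": 5.0}
DEPS = {"builder": [], "frontend": [], "final": ["builder", "frontend"]}


def sequential_time(durations: dict[str, float]) -> float:
    """Total time if every stage runs one after another."""
    return sum(durations.values())


def critical_path_time(durations: dict[str, float], deps: dict[str, list[str]]) -> float:
    """Best-case time if independent stages overlap perfectly."""

    @lru_cache(maxsize=None)
    def finish(stage: str) -> float:
        # A stage starts when the slowest of its dependencies has finished.
        start = max((finish(d) for d in deps[stage]), default=0.0)
        return start + durations[stage]

    return max(finish(stage) for stage in durations)


print(sequential_time(DURATIONS))           # prints 110.0
print(critical_path_time(DURATIONS, DEPS))  # prints 70.0
```

With these made-up numbers the build is bounded by the frontend stage plus the final stage. Shortening builder would not help, while shortening frontend would.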

Parallel execution also works together with the build cache. Each stage is checked against the cache independently, so a stage that can be fully satisfied from cache completes almost immediately and can unblock later stages sooner. BuildKit does not simply run things side by side; it schedules the whole build graph while respecting both cache and dependencies.
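
The per-stage cache behaviour can be illustrated with a toy model in Python. This is only a sketch under loud assumptions: the key scheme below (a hash chained over the instruction text and a digest of the copied files) is invented for illustration and is not BuildKit's real cache key format, and `stage_key` as well as the digest strings such as `ctx-v1` are hypothetical.

```python
import hashlib


def digest(*parts: str) -> str:
    """Hash a sequence of strings into one hex digest."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode())
        h.update(b"\0")
    return h.hexdigest()


def stage_key(base_image: str, instructions: list[str], inputs: dict[str, str]) -> str:
    """Toy cache key: chain each instruction with the digest of the files it copies."""
    key = digest("FROM", base_image)
    for instruction in instructions:
        key = digest(key, instruction, inputs.get(instruction, ""))
    return key


BUILDER = ("golang:1.21-alpine", ["WORKDIR /app", "COPY . .", "RUN go build -o myapp ."])
FRONTEND = (
    "node:18-alpine",
    [
        "WORKDIR /app/frontend",
        "COPY frontend/package.json frontend/package-lock.json ./",
        "RUN npm ci",
        "COPY frontend/ ./",
        "RUN npm run build",
    ],
)

# Hypothetical content digests of the files each COPY instruction reads.
before = {
    "COPY . .": "ctx-v1",
    "COPY frontend/package.json frontend/package-lock.json ./": "lock-v1",
    "COPY frontend/ ./": "fe-v1",
}
# A Go source file outside frontend/ was edited: only the full context digest changes.
after = {**before, "COPY . .": "ctx-v2"}

builder_changed = stage_key(*BUILDER, before) != stage_key(*BUILDER, after)
frontend_changed = stage_key(*FRONTEND, before) != stage_key(*FRONTEND, after)
print(builder_changed, frontend_changed)  # prints True False
```

In this model an edit to the Go sources invalidates builder but leaves frontend fully cached, so only the Go build has to run again before the final stage can proceed. Note that the reverse does not hold for the example Dockerfile: COPY . . in builder reads the whole build context, so an edit under frontend/ changes its inputs as well.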

The next step in optimizing Docker builds with BuildKit involves leveraging its advanced caching mechanisms, such as remote cache and cache mounts.
