BuildKit’s garbage collection is a surprisingly aggressive beast that can reclaim disk space faster than you might expect, but it’s also configurable to prevent accidental data loss.

Let’s see it in action. Imagine you’re building a Docker image.

# Dockerfile
FROM alpine:latest
RUN echo "hello world" > /hello.txt
CMD ["cat", "/hello.txt"]

When you build this, BuildKit creates layers, but also intermediate build cache.

docker build -t my-alpine-app .

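BuildKit stores these layers and cache under its state directory: /var/lib/docker/buildkit/ for Docker Engine’s embedded builder, or /var/lib/buildkit/ for a standalone buildkitd with default settings. Every build step that misses the cache adds to it, so without garbage collection or manual pruning the directory keeps growing.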
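To see how much space the cache occupies right now, you can compare the raw directory size with BuildKit’s own accounting. This is a minimal sketch: the CACHE_DIR default and the dir_size_kb helper are illustrative assumptions, reading the directory usually requires root, and docker buildx du only works when the buildx plugin can reach a running daemon.

```shell
#!/bin/sh
# Assumed default location for Docker's embedded BuildKit; override CACHE_DIR
# if your daemon uses a different data root.
CACHE_DIR="${CACHE_DIR:-/var/lib/docker/buildkit}"

# Print the on-disk size of a directory in kilobytes, or 0 if it is unreadable.
dir_size_kb() {
  du -sk "$1" 2>/dev/null | awk '{ print $1; found = 1 } END { if (!found) print 0 }'
}

echo "cache dir: $CACHE_DIR ($(dir_size_kb "$CACHE_DIR") KB on disk)"

# BuildKit's own view, one line per cache record (needs a reachable daemon).
if command -v docker >/dev/null 2>&1; then
  docker buildx du || echo "docker buildx du failed; is the daemon running?" >&2
fi
```

With a standalone daemon, buildctl du gives the same kind of per-record listing.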

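To understand what BuildKit is keeping, we need to look at its garbage collection settings. For a standalone buildkitd (including builders created with the docker-container buildx driver), they live in the buildkitd.toml configuration file, by default /etc/buildkit/buildkitd.toml, under the section of the worker in use ([worker.oci] or [worker.containerd]). Docker Engine’s embedded builder reads the equivalent settings from the builder.gc section of daemon.json instead. These settings decide how much cache BuildKit keeps, how old unused entries may get, and which kinds of records a rule may delete.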

The key parameters are:

  • keepBytes (in a [[worker.oci.gcpolicy]] rule): This defines the amount of cache, in bytes, that the rule leaves on disk. GC under that rule starts only once the cache exceeds this size and stops as soon as usage is back at or below it. Recent BuildKit releases document the same limit under the name reservedSpace.

    • Diagnosis: Check your buildkitd.toml for keepBytes inside a [[worker.oci.gcpolicy]] block. If no policy is defined, BuildKit generates default rules whose limit is derived from the size of the disk (historically about 10% of it) rather than from a fixed figure.
    • Fix: To keep up to 50 GiB of cache before the rule starts deleting, add or modify:
      [[worker.oci.gcpolicy]]
        keepBytes = 53687091200 # 50 GiB in bytes
      
    • Why it works: The garbage collector stops deleting once cache usage falls to this value, so it never wipes the cache completely and does nothing while usage stays under the threshold.
  • gc (under [worker.oci]): A boolean that turns automatic garbage collection on or off. buildkitd.toml has no documented key for a GC interval; the daemon schedules the runs itself, typically shortly after it starts and after builds complete.

    • Diagnosis: Inspect buildkitd.toml for gc = false under [worker.oci] (or [worker.containerd]). With GC disabled, cache is removed only when you prune manually.
    • Fix: To let BuildKit collect garbage automatically, set:
      [worker.oci]
        gc = true
      
    • Why it works: With GC enabled, the daemon evaluates its GC rules on its own schedule, so unused build cache is regularly identified and removed without manual intervention.
  • keepDuration (in a [[worker.oci.gcpolicy]] rule): This is the age limit for unused build cache, in seconds. Entries used more recently than this are kept by the rule; older ones become eligible for deletion.

    • Diagnosis: Look for keepDuration in the gcpolicy rules of your buildkitd.toml. A rule without it ignores age and removes roughly the least recently used entries first whenever the cache exceeds the rule’s storage limit.
    • Fix: To make cache that hasn’t been used in 7 days eligible for cleanup, configure:
      [[worker.oci.gcpolicy]]
        keepDuration = 604800 # 7 days in seconds
      
    • Why it works: This prunes old, unused cache entries, freeing up disk space by removing data that’s unlikely to be needed again.
  • all (in a [[worker.oci.gcpolicy]] rule): A boolean flag. By default a rule skips some record types, such as internal and frontend records. With all = true, the rule may delete any record that is not in use. It does not override the rule’s storage or age limits, which still apply when they are set.

    • Diagnosis: Check for all in the gcpolicy rules of your buildkitd.toml. It is false unless set.
    • Fix: If you need to reclaim as much space as possible, temporarily add a final catch-all rule:
      [[worker.oci.gcpolicy]]
        all = true
      
    • Why it works: This rule sets neither a storage limit nor an age limit, so it removes every record that is not in use, including the types other rules skip. Use with caution, and give the rule a storage limit if you want a cap rather than a wipe.
  • Manual Garbage Collection: You can also trigger cleanup manually.

    • Diagnosis: If you suspect cache is accumulating and GC isn’t running as expected, you can force it.

    • Fix: Run the following command against the BuildKit daemon:

      buildctl prune
      

      For Docker Engine’s embedded builder, the equivalent is docker buildx prune (or docker builder prune).

    • Why it works: This runs immediately instead of waiting for the daemon’s next GC pass. A bare prune removes all cache that is not in use rather than applying your configured GC rules; pass --keep-duration to spare recently used entries, or --all to include the record types that are skipped by default.
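A manual prune can be limited to older entries instead of clearing everything. The sketch below only prints the command unless RUN=1 is set, so it is safe to try. The prune_cmd helper and the 168h cutoff are illustrative assumptions; the flags themselves (--keep-duration for buildctl, the until filter for docker buildx prune) are the ones those tools document.

```shell
#!/bin/sh
# Print the prune command for a given tool, keeping entries used within $2.
prune_cmd() {
  case "$1" in
    buildctl) echo "buildctl prune --keep-duration $2" ;;
    docker)   echo "docker buildx prune --force --filter until=$2" ;;
    *)        echo "unsupported tool: $1" >&2; return 1 ;;
  esac
}

# Show the command first; execute it only when RUN=1 is set explicitly.
CMD=$(prune_cmd buildctl 168h)
echo "$CMD"
if [ "${RUN:-0}" = "1" ]; then
  $CMD
fi
```

Swap buildctl for docker in the call to target Docker Engine’s embedded builder instead of a standalone daemon.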

It’s also important to note that BuildKit’s garbage collection only removes cache that is not in use. Records that a running build still depends on are skipped by the automatic GC. The age limit (keepDuration in a gcpolicy rule) is crucial for ensuring that stale, unused cache doesn’t linger indefinitely. The storage limit (keepBytes in the same rule) is a floor, not a target: BuildKit will not add cache to meet this floor, it only refrains from deleting below it.
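The storage and age limits can be combined into one ordered policy. The sketch below writes an example file using the key names from BuildKit’s documented buildkitd.toml format: keepBytes for the storage limit, keepDuration for the age limit, and all for the catch-all rule. It assumes a standalone buildkitd with the OCI worker; the 50 GiB and 7 day values and the output path are illustrative, not recommendations.

```shell
#!/bin/sh
# Illustrative values; pick limits that match your disk.
KEEP_BYTES=$((50 * 1024 * 1024 * 1024))  # 50 GiB
KEEP_SECONDS=$((7 * 24 * 60 * 60))       # 7 days

# Hypothetical output path so nothing under /etc is touched by accident.
CONFIG="${CONFIG:-./buildkitd.example.toml}"

cat > "$CONFIG" <<EOF
[worker.oci]
  gc = true

  # Rule 1: once the cache exceeds the limit, delete entries unused for 7 days.
  [[worker.oci.gcpolicy]]
    keepBytes = ${KEEP_BYTES}
    keepDuration = ${KEEP_SECONDS}

  # Rule 2: if that was not enough, delete any unused record down to the limit.
  [[worker.oci.gcpolicy]]
    all = true
    keepBytes = ${KEEP_BYTES}
EOF

echo "wrote $CONFIG"
```

Rules are evaluated in order, so the gentler rule comes first and the catch-all last. After reviewing the file, copy it to the path your daemon reads and restart buildkitd for it to take effect.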

Once you’ve configured your buildkitd.toml and restarted the BuildKit daemon, the most likely remaining failure is running out of disk space ("no space left on device") on the partition where /var/lib/docker/buildkit resides. That happens when your GC settings are too permissive or your build churn is exceptionally high.
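A simple check of that partition can warn you before builds start failing. This is a sketch under stated assumptions: the fs_used_pct helper, the TARGET default, and the 85% LIMIT are illustrative choices, and the script falls back to / when the cache directory does not exist on the machine.

```shell
#!/bin/sh
# Print the used-space percentage (digits only) of the filesystem holding $1.
fs_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Assumed location and threshold; both are illustrative.
TARGET="${TARGET:-/var/lib/docker/buildkit}"
LIMIT="${LIMIT:-85}"

# Fall back to / when the directory does not exist on this machine.
[ -d "$TARGET" ] || TARGET="/"

PCT=$(fs_used_pct "$TARGET")
if [ "$PCT" -ge "$LIMIT" ]; then
  echo "warning: filesystem for $TARGET is ${PCT}% full" >&2
else
  echo "ok: filesystem for $TARGET is ${PCT}% full"
fi
```

Run from cron or a CI pre-step, a warning here is a signal to lower the storage limit or prune manually.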
