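Docker’s COPY --link flag, available with BuildKit and Dockerfile syntax version 1.4 or later, copies files into their own independent layer instead of building that layer on top of the filesystem left by the previous instruction. Because the resulting layer does not depend on what came before it, BuildKit can reuse it from cache even when the base image or earlier instructions change. This can noticeably speed up rebuilds, especially when copying large files or directories that haven’t changed between builds.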

Let’s see this in action. Imagine we have a large dataset file we want to include in our Docker image.

# Dockerfile
FROM ubuntu:latest
COPY --link data.csv /app/data.csv
RUN echo "Data loaded." > /app/status.txt
CMD ["cat", "/app/data.csv"]

Now, let’s build this image twice: first with a plain COPY (temporarily remove --link from the COPY instruction), and then with COPY --link as written above.

Build 1 (without --link)

# Create a dummy large file
dd if=/dev/zero of=data.csv bs=1M count=100

# Build the image (for this run, remove --link from the COPY instruction in the Dockerfile)
docker build -t my-app-nolink .
# Output will show the copy operation taking time, especially if data.csv is large.

Build 2 (with --link)

# Re-use the same data.csv file
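docker build --no-cache -t my-app-link .
# No extra build flag is needed: --link is written in the COPY instruction itself.
# On this first, uncached build the copy still has to run, so expect a similar time.
# The gain shows up on later rebuilds, when the base image or earlier instructions change but data.csv does not.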

On a first, uncached build the two timings are likely to be close, because the file’s contents are copied into the image either way. The --link flag does not create a symbolic link, and the image never refers to files on your host machine. Instead, BuildKit copies the files into a separate, empty snapshot and attaches the result as an independent layer on top of the previous state. The difference shows up on rebuilds: when the base image or an earlier instruction changes, a plain COPY layer has to be rebuilt, while a COPY --link layer can be reused from cache as long as the source files are unchanged.

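This might seem like magic, but it follows from how layers are computed. A plain COPY is evaluated against the filesystem produced by all earlier instructions, so its layer is tied to that state. With COPY --link, BuildKit builds the layer as if it were copying into an empty filesystem and then stacks it on top of the previous layers, so the layer’s contents depend only on the source files. The contents are still captured at build time: if data.csv changes on the host after the image is built, existing images and containers keep the old version until you rebuild. The critical caveat is a different one. Because the copy cannot read the earlier filesystem, it does not follow symlinks in the destination path, and the destination is always created as plain directories. If your destination path relies on a symlink created by the base image or an earlier instruction, you should not use --link.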
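As a rough mental model of the behavior described above, COPY --link acts as if the copied files were built in a stage of their own and then stacked onto the base. The sketch below is conceptual only: the stage names base and linked-files are hypothetical, and the final merge is something BuildKit does internally, not something a Dockerfile instruction can express.

```dockerfile
# syntax=docker/dockerfile:1
# Conceptual sketch, not a replacement for COPY --link.
# Stage names "base" and "linked-files" are hypothetical.

# Piece 1: the state produced by everything before the COPY.
FROM ubuntu:latest AS base

# Piece 2: the copied files, built against an empty filesystem.
# Nothing here can see or depend on the contents of "base".
FROM scratch AS linked-files
COPY data.csv /app/data.csv

# With COPY --link, BuildKit stacks the layer from piece 2 on top of
# piece 1. If "base" changes later, piece 2 is unchanged and can be
# reused from cache as long as data.csv is the same.
```

Seen this way, the symlink caveat is easier to remember: piece 2 starts from scratch, so it has no way to know that a directory in the base image is really a symlink.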

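The primary benefit is better cache reuse, which translates into speed. If you have a large node_modules directory, a hefty dataset, or pre-compiled binaries that don’t change often, COPY --link keeps their layer valid when the base image is updated or an earlier instruction changes, which can save significant time on rebuilds. It is also useful in multi-stage builds, where a COPY --from layer would otherwise be invalidated by any change to earlier instructions in the same stage. It’s particularly effective in CI/CD pipelines where build speed is paramount.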
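A minimal multi-stage sketch of the node_modules case might look like the following. The stage name deps, the image tags, and the file index.js are assumptions made for illustration; adapt them to your project.

```dockerfile
# syntax=docker/dockerfile:1

# Hypothetical dependency stage: installs packages once.
FROM node:20 AS deps
WORKDIR /src
COPY package.json package-lock.json ./
RUN npm ci

# Final image. Both copies use --link, so their layers do not depend
# on the base image or on each other.
FROM node:20-slim
WORKDIR /app
COPY --link --from=deps /src/node_modules /app/node_modules
COPY --link index.js /app/index.js
CMD ["node", "index.js"]
```

If the node:20-slim base is updated while the dependencies stay the same, the node_modules layer from the previous build can be reused instead of being copied again.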

To use it, you simply add --link to your COPY instruction:

COPY --link --chown=app:app my-large-directory /app/my-large-directory

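No extra build flag is required. COPY --link is part of the Dockerfile syntax (version 1.4 and later) and needs BuildKit as the builder. BuildKit is the default builder in recent Docker releases. On older installations you can enable it for a single build by setting the DOCKER_BUILDKIT environment variable to 1, or for all builds by enabling the buildkit feature in the Docker daemon configuration. Adding a # syntax=docker/dockerfile:1 directive as the first line of the Dockerfile makes the build use a current Dockerfile frontend.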
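Put together, a Dockerfile that opts into a current frontend could look like this sketch. The image tag my-app-link in the comments is only an example name.

```dockerfile
# syntax=docker/dockerfile:1
# The directive above must be the very first line of the file;
# once a comment, blank line, or instruction has been read, it is
# treated as an ordinary comment.
FROM ubuntu:latest
COPY --link data.csv /app/data.csv

# Build as usual:
#   docker build -t my-app-link .
# On an older engine where BuildKit is not the default builder:
#   DOCKER_BUILDKIT=1 docker build -t my-app-link .
```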

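A common misconception is that --link creates a symbolic or hard link to the source, or that it makes the copied file read-only within the container. Neither is true. The result is a regular file in an image layer, and it is still accessible and modifiable in the container’s runtime environment. The "link" describes how the layer is attached to the image: it is linked on top of the previous layers rather than computed from them. The data lives in the image, not on the host filesystem.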

When base images or early instructions change frequently while large copied files stay the same, COPY --link becomes a significant performance enhancer. It changes how BuildKit handles file copying by treating each copied layer as a self-contained unit that can be cached and reattached independently of the layers beneath it.

The next step after optimizing your file copying is to explore how to manage build cache invalidation more granularly.
