The containerd Transfer Service, by default, operates in a "pull-based" model where each node independently fetches image layers from a registry.
Let’s see it in action. Imagine you have a private registry at my.private.registry.com and you want to distribute an image my-app:latest to several nodes.
First, on the node that has the image (let’s call it the "source"), ensure containerd is running and the image is present. You can verify this with:
sudo ctr images list
You’d see my.private.registry.com/my-app:latest in the output.
Now, on a "target" node that doesn’t have the image, you’d typically run:
sudo ctr images pull my.private.registry.com/my-app:latest
This is the standard pull operation. containerd on the target node connects to my.private.registry.com and downloads every layer it does not already have. That works fine for a few nodes, but registry traffic grows linearly with the number of nodes pulling: with hundreds or thousands of nodes, each one fetching the same layers, the registry’s bandwidth becomes the bottleneck and deployments slow down.
This is where the Transfer Service shines. It allows nodes to share image layers directly with each other, bypassing the central registry for subsequent distributions.
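To put rough numbers on the difference, here is a back-of-the-envelope sketch. The image size and node count are made-up values for illustration, and the "one seed pull" case is the idealized best case in which only the first node touches the registry.

```python
def registry_egress_gb(image_size_gb: float, pulls_from_registry: int) -> float:
    """Total data the registry has to serve, in GB, for a number of full image pulls."""
    return image_size_gb * pulls_from_registry


IMAGE_SIZE_GB = 1.5  # hypothetical image size
NODES = 500          # hypothetical fleet size

# Every node pulls the whole image from the registry.
every_node_pulls = registry_egress_gb(IMAGE_SIZE_GB, NODES)

# Idealized sharing: one node pulls from the registry, the rest get layers elsewhere.
single_seed_pull = registry_egress_gb(IMAGE_SIZE_GB, 1)

print(f"every node pulls from the registry: {every_node_pulls:.1f} GB")
print(f"one seed pull, rest from peers:     {single_seed_pull:.1f} GB")
```

Real fleets land somewhere between the two figures, since some nodes will still fall back to the registry.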
To enable this, you need to configure containerd on every node that should take part. The relevant configuration is in /etc/containerd/config.toml. You’ll need to add or modify the [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options] section to include TransferService. More critically, you need to enable the Transfer Service plugin itself in the same file:
[plugins]
[plugins."io.containerd.grpc.v1.transfer"]
enabled = true
This tells containerd to activate the Transfer Service plugin. Restart containerd after editing the file so the new configuration is loaded.
Once enabled, containerd automatically starts advertising its available image layers. When a node needs an image that it doesn’t have locally, it will first check if any other nodes in its immediate network (or configured peers) have the required layers. If a peer is found with the data, containerd will attempt to transfer the layers directly from that peer.
The magic is that containerd doesn’t need a static list of peers. It relies on a discovery mechanism instead: when a ctr images pull command is issued with the Transfer Service enabled, containerd queries its local network for other containerd instances advertising image data and, if it finds one, initiates a direct peer-to-peer transfer.
Here’s how the internal flow changes:
- Initial Pull: The very first node to pull my.private.registry.com/my-app:latest will still fetch it entirely from the registry.
- Subsequent Pulls (with Transfer Service): A second node, when asked to ctr images pull my.private.registry.com/my-app:latest, will first discover other containerd nodes. It finds the first node, which has the image.
- Layer Discovery: The second node asks the first node, "Do you have layer X?"
- Direct Transfer: If the first node has layer X, it serves it directly to the second node over the containerd gRPC API. This happens for all layers of the image.
- Registry Fallback: If no peer is found with the required layers, or if the peer transfer fails, containerd falls back to pulling from the original registry.
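The flow above can be modeled in a few lines. This is a simplified, in-memory sketch of the decision logic only, not containerd’s API: the Node class, the pull_image function, and the layer digests are all hypothetical names invented for the illustration, and a failed peer transfer is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a node; maps layer digest -> layer bytes."""
    name: str
    layers: Dict[str, bytes] = field(default_factory=dict)


def pull_image(target: Node, layer_digests: List[str],
               peers: List[Node], registry: Dict[str, bytes]) -> Dict[str, str]:
    """Fetch each layer from a peer if one has it, otherwise from the registry.

    Returns a mapping of digest -> name of the source that served it.
    """
    sources: Dict[str, str] = {}
    for digest in layer_digests:
        if digest in target.layers:
            sources[digest] = "local"  # already present, nothing to fetch
            continue
        # Layer discovery: ask each peer, "Do you have layer X?"
        peer = next((p for p in peers
                     if p is not target and digest in p.layers), None)
        if peer is not None:
            target.layers[digest] = peer.layers[digest]  # direct transfer
            sources[digest] = peer.name
        else:
            target.layers[digest] = registry[digest]     # registry fallback
            sources[digest] = "registry"
    return sources


# Made-up digests and contents standing in for the layers of my-app:latest.
registry = {"sha256:aaa": b"base layer", "sha256:bbb": b"app layer"}
image = list(registry)

node1, node2 = Node("node-1"), Node("node-2")
first = pull_image(node1, image, peers=[node2], registry=registry)
second = pull_image(node2, image, peers=[node1], registry=registry)

print(first)   # node-1 finds no peer with the layers
print(second)  # node-2 is served by node-1
```

Because the decision is made per layer, a node can take some layers from a peer and the rest from the registry during the same pull.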
This peer-to-peer distribution significantly reduces the load on your registry, especially in environments with many nodes or frequent image updates. It’s particularly effective for large images where downloading all layers from scratch for every new node is time-consuming.
The surprising thing about containerd’s Transfer Service is its implicit discovery. You don’t typically configure explicit peer lists. Instead, containerd nodes on the same broadcast domain or network segment will discover each other and advertise their available content. This makes it robust to dynamic environments where nodes join and leave.
The one thing most people don’t know is that this peer-to-peer transfer uses the same underlying content-addressable storage as the standard registry pull. When a layer is transferred, it’s stored in the same blob store on the receiving node as if it were pulled from a registry. This means there’s no extra overhead in terms of how containerd manages its image data locally; it just changes the source of that data.
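A toy content-addressed store makes the point concrete. This is a minimal sketch of the idea rather than containerd’s actual content store: ContentStore is a hypothetical class, and the only detail borrowed from the real world is the sha256:<hex> digest format used for image layers.

```python
import hashlib
from typing import Dict


class ContentStore:
    """Toy content-addressed blob store: the key is the digest of the bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data under its SHA-256 digest and return that digest."""
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        # The same bytes always map to the same key, so a repeat write is a no-op.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
layer = b"example layer bytes"

digest_from_registry = store.put(layer)  # bytes that arrived from a registry
digest_from_peer = store.put(layer)      # the same bytes, arriving from a peer

print(digest_from_registry == digest_from_peer)
print(len(store))
```

The store never records where the bytes came from. The digest is the layer’s identity, which is why changing the source does not change how the data is kept locally.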
The next concept you’ll likely encounter is optimizing this discovery and transfer process, perhaps by using dedicated distribution networks or more advanced peering strategies.