containerd’s stargz remote snapshotter lets you pull container images lazily, meaning you only download the parts of an image you actually need when a container starts.
Let’s see it in action. Imagine you have a large image, say ghcr.io/stargz-containers/stargz-snapshotter:latest, which is around 1GB.
# First, ensure containerd is installed and the stargz snapshotter daemon
# (containerd-stargz-grpc) is running. containerd talks to it as a proxy plugin,
# which typically involves editing /etc/containerd/config.toml to include:
#
# [proxy_plugins]
#   [proxy_plugins.stargz]
#     type = "snapshot"
#     address = "/run/containerd-stargz-grpc/containerd-stargz-grpc.sock"
#
# For Kubernetes (CRI) workloads, also select it in the
# [plugins."io.containerd.grpc.v1.cri".containerd] section:
# snapshotter = "stargz"
# disable_snapshot_annotations = false
#
# Restart containerd after these changes:
# sudo systemctl restart containerd
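# Now, let's pull the image lazily. A plain 'ctr images pull' downloads every layer in full,
# so use ctr-remote, the CLI shipped with stargz-snapshotter. This command returns quickly
# because it only downloads the manifest, the image config, and the per-layer estargz index.
ctr-remote images rpull ghcr.io/stargz-containers/stargz-snapshotter:latest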
# Inspecting the image shows its size, but the actual layers aren't fully downloaded yet.
ctr images inspect ghcr.io/stargz-containers/stargz-snapshotter:latest
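# When you run a container from this image with the stargz snapshotter selected,
# the file contents it needs are fetched transparently on demand.
# ('stargz-demo' is the container ID, not an image name.)
ctr run --rm --snapshotter=stargz ghcr.io/stargz-containers/stargz-snapshotter:latest stargz-demo echo "Hello from stargz!"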
# You can observe network traffic during the 'ctr run' command to see data being fetched.
# If you stop and restart the container, or start another one from the same image,
# the already-pulled layers will be reused, and only new or missing parts will be fetched.
The core problem stargz solves is the significant delay and bandwidth usage associated with pulling large container images, especially in environments with high container churn or slow network connections. Traditional image pulling downloads every layer in full before the container can start, even if only a small fraction of the image is ever used by the running container. This is particularly wasteful for multi-gigabyte images whose containers touch only a handful of files at startup.
Stargz (short for "seekable tar.gz", a format that originated in Google’s CRFS project) tackles this with an extended image format called estargz. An estargz layer is still a valid tar.gz archive, but each file (or each chunk of a large file) is compressed as its own gzip member, and an index, the table of contents (TOC), is appended near the end of the blob. The TOC contains metadata about the archive’s contents, including the offset and size of individual files or chunks within the archive. The stargz-snapshotter uses this index to download only specific chunks of the image on demand, using HTTP range requests against the registry. When a container process tries to access a file that hasn’t been downloaded yet, the snapshotter intercepts the read request, consults the TOC, fetches the required chunk from the remote registry, and then allows the read operation to proceed. This "lazy pulling" can dramatically reduce the initial pull time and the overall data transferred, especially when a workload touches only a small subset of an image’s files.
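The property that makes such an archive seekable can be sketched with plain gzip: independently compressed members can be concatenated into one valid stream, and any single member can be decompressed from its byte range alone. The snippet below is a toy illustration, not the estargz layout itself; the file names and the offset/length "index" are made up for the example, and a real layer stores tar entries plus a JSON TOC.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Two "files", each compressed as its own gzip member (hypothetical contents).
printf 'contents of file A\n' | gzip -c > "$work/a.gz"
printf 'contents of file B\n' | gzip -c > "$work/b.gz"

# Concatenated gzip members still form one valid gzip stream.
cat "$work/a.gz" "$work/b.gz" > "$work/blob.gz"

# A toy "index": byte offset and length of the member that holds file B.
offset=$(($(wc -c < "$work/a.gz")))
length=$(($(wc -c < "$work/b.gz")))

# Read only B's byte range (as an HTTP range request would) and decompress it.
file_b=$(tail -c +"$((offset + 1))" "$work/blob.gz" | head -c "$length" | gzip -dc)
echo "$file_b"

# Ordinary tools still see the blob as a single archive containing both files.
whole=$(gzip -dc "$work/blob.gz")
```

Against a registry, the `tail | head` step would be replaced by a ranged fetch of the same bytes; everything else about the idea stays the same.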
The stargz-snapshotter acts as a transparent proxy for the container runtime’s filesystem access. It registers itself as a snapshotter plugin for containerd. When containerd needs to prepare a filesystem for a new container based on an estargz image, it delegates the task to the stargz-snapshotter. The snapshotter then mounts the estargz archive in a way that allows on-demand fetching of its contents. The key is that the snapshotter doesn’t unpack the entire image to the host’s storage initially. Instead, it creates a virtual filesystem where files are only materialized (downloaded and written to disk) when they are first accessed by the container. Subsequent accesses to the same file, if it has already been fetched, will be served directly from the local cache.
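The fetch-on-first-access behavior described above can be modeled in a few lines of shell. This is only a conceptual sketch: the "remote" is a local directory standing in for the registry, `lazy_read` is a hypothetical helper, and the real snapshotter does this at the filesystem level for chunks rather than whole files.

```shell
#!/bin/sh
set -eu
remote=$(mktemp -d)   # stands in for the remote registry (assumption)
cache=$(mktemp -d)    # stands in for the snapshotter's local cache
fetches=0             # counts how often we had to go to the "remote"

printf 'hello from the image\n' > "$remote/motd"

# lazy_read NAME: fetch NAME into the cache on first access, then serve it locally.
lazy_read() {
  name=$1
  if [ ! -f "$cache/$name" ]; then
    cp "$remote/$name" "$cache/$name"   # the "download" happens only here
    fetches=$((fetches + 1))
  fi
  cat "$cache/$name"
}

lazy_read motd > /dev/null   # first access: fetched from the remote
lazy_read motd > /dev/null   # second access: served from the cache
echo "remote fetches: $fetches"
```

Files that are never read are never copied, which is the whole point of the lazy model.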
You control the behavior primarily through the image itself. Images need to be built or converted into the estargz format. BuildKit can emit estargz layers directly at build time, and existing images can be converted with nerdctl or with ctr-remote, the CLI shipped with stargz-snapshotter. For example:
# Convert an image already in containerd's image store to estargz
# (registry.example.com is a placeholder for your own registry)
nerdctl image convert --estargz --oci my-image:latest registry.example.com/my-image:esgz
# Then push the converted image to a registry.
nerdctl push registry.example.com/my-image:esgz
# The stargz-snapshotter knows how to interpret this format. Runtimes without it
# can still pull the image as an ordinary gzip-compressed one.
The stargz-snapshotter itself has minimal configuration, mostly related to its integration within containerd’s config.toml. The primary "levers" are the image format and the network connectivity to the registry.
One aspect that often trips people up is how caching works with estargz. The stargz-snapshotter maintains a local cache of fetched chunks. This cache lives under the snapshotter’s own root directory rather than in containerd’s image store (by default /var/lib/containerd-stargz-grpc/ when it runs as the standalone containerd-stargz-grpc daemon; embedded setups may use a path such as /var/lib/containerd/io.containerd.snapshotter.v1.stargz/). Removing an image with ctr images rm may not free this space immediately, and docker system prune typically does not reach it at all, because the cache is managed by a separate snapshotter plugin. To clear it explicitly, you may need to stop the snapshotter and containerd, remove the cache directories by hand, and restart both, which is a more involved operation and discards all lazily pulled state.
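Before deleting anything by hand, it helps to see how much space the snapshotter is actually using. The following is a minimal sketch; the default path and the `STARGZ_ROOT` variable are assumptions for illustration, so point it at wherever your snapshotter keeps its state.

```shell
#!/bin/sh
set -eu
# Assumption: the snapshotter's root directory. Override via the environment.
STARGZ_ROOT=${STARGZ_ROOT:-/var/lib/containerd-stargz-grpc}

# stargz_usage_kb DIR: print the disk usage of DIR in KiB, or fail if it is missing.
stargz_usage_kb() {
  root=$1
  if [ ! -d "$root" ]; then
    echo "no such directory: $root" >&2
    return 1
  fi
  du -sk "$root" | awk '{print $1}'
}

if [ -d "$STARGZ_ROOT" ]; then
  echo "stargz state under $STARGZ_ROOT: $(stargz_usage_kb "$STARGZ_ROOT") KiB"
else
  echo "nothing found at $STARGZ_ROOT"
fi
```

Reading the directory usually requires root, so run it with sudo on a real host.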
The next thing you’ll likely encounter is understanding how estargz interacts with mutable files within a container and how to manage the layer lifecycle when using this snapshotter.