OverlayFS is the kernel filesystem behind overlay2, which has long been Docker’s default storage driver on Linux. Understanding it is key to grasping how containers manage their files and how image layers are assembled. It’s not magic; it’s a clever filesystem trick.
Let’s see OverlayFS in action with a simple Alpine Linux container.
First, we need a base image. We’ll use alpine:latest.
docker pull alpine:latest
Now, let’s run a container from this image and create a file.
docker run -it --name my-alpine alpine:latest sh
Inside the container, create a file:
/ # echo "Hello from Alpine" > /hello.txt
/ # exit
Next, we’ll run another container from the same image and see that our file isn’t there, but the base filesystem is shared.
docker run -it --name another-alpine alpine:latest sh
Inside this container, check for the file:
/ # cat /hello.txt
cat: /hello.txt: No such file or directory
/ # ls /
bin dev etc home lib media mnt opt proc root run sbin srv sys tmp usr var
/ #
As expected, hello.txt is nowhere to be found: cat fails, and ls / shows only the files that ship with the image. The file we created lives in the first container’s private writable layer, not in the shared image. This is where OverlayFS comes in: each container’s root filesystem is a union of several directories.
The core idea behind OverlayFS is a "union mount." Imagine you have two directories: a lowerdir and an upperdir. OverlayFS lets you present them as a single, unified filesystem. When you read a file, OverlayFS first checks the upperdir. If it’s there, it serves it from there. If not, it checks the lowerdir.
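Here’s a toy model of that lookup order, a minimal sketch that uses two plain temporary directories rather than a real overlay mount. The helper name union_cat and the file names are made up for illustration.

```shell
# Toy model of OverlayFS read lookup using two plain directories.
# This is NOT a real overlay mount; union_cat is a hypothetical helper.
lower=$(mktemp -d)   # stands in for a read-only image layer
upper=$(mktemp -d)   # stands in for the container's writable layer

echo "from lower" > "$lower/a.txt"
echo "from lower" > "$lower/b.txt"
echo "from upper" > "$upper/b.txt"   # same name as in lower: upper wins

# Print a file from the merged view: check upper first, then lower.
union_cat() {
    if [ -e "$upper/$1" ]; then
        cat "$upper/$1"
    elif [ -e "$lower/$1" ]; then
        cat "$lower/$1"
    else
        echo "cat: /$1: No such file or directory" >&2
        return 1
    fi
}

union_cat a.txt   # exists only in lower, so the lower copy is served
union_cat b.txt   # exists in both, so the upper copy is served
```

The real kernel does this per path component and also merges directory listings, which this sketch leaves out.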
When you write to a file in the unified view, OverlayFS performs a "copy-on-write" operation. If the file exists only in the lowerdir, a copy is made to the upperdir before the write occurs. Subsequent reads will then find the modified file in the upperdir. If you delete a file that was in the lowerdir, OverlayFS creates a "whiteout" file in the upperdir to effectively hide the file from the lowerdir in the unified view.
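The write and delete rules can be sketched the same way with plain directories. This is only a model: the helpers union_write, union_rm, and union_exists are hypothetical, it handles flat file names only, and it marks deletions with a ".wh." marker file, whereas the kernel’s OverlayFS uses a character device with device number 0/0 (creating one needs privileges, so the model avoids it).

```shell
# Toy model of copy-up and whiteouts with plain directories (no real mount).
# union_write, union_rm and union_exists are hypothetical helpers.
lower=$(mktemp -d)   # read-only layer: the helpers never modify it
upper=$(mktemp -d)   # writable layer

echo "original" > "$lower/config.txt"
echo "doomed"   > "$lower/old.txt"

# Append a line to a file, copying it up from lower first if needed.
union_write() {
    if [ -e "$upper/.wh.$1" ]; then
        rm -f "$upper/.wh.$1"             # recreating a deleted file
    elif [ ! -e "$upper/$1" ] && [ -e "$lower/$1" ]; then
        cp -p "$lower/$1" "$upper/$1"     # the "copy-up" step
    fi
    echo "$2" >> "$upper/$1"
}

# Delete a file: drop any upper copy, and mask a lower copy with a marker.
union_rm() {
    rm -f "$upper/$1"
    if [ -e "$lower/$1" ]; then
        : > "$upper/.wh.$1"               # stand-in for a whiteout
    fi
}

# A file is visible if it is not whited out and exists in either layer.
union_exists() {
    [ ! -e "$upper/.wh.$1" ] && { [ -e "$upper/$1" ] || [ -e "$lower/$1" ]; }
}

union_write config.txt "changed"   # lower copy stays "original"
union_rm old.txt                   # lower copy stays on disk, but hidden
```

After running this, the lower directory is byte-for-byte unchanged; every modification and deletion is recorded in the upper directory alone.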
Docker uses this for its layered images. Each layer in an image is a lowerdir. When you run a container, Docker creates a unique upperdir for it. The container’s root filesystem is then an OverlayFS mount combining all the image layers (bottom-up) with the container’s writable upperdir on top.
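To make the stacking concrete, here’s a sketch that builds the option string for an overlay mount with several lower layers. All paths are hypothetical placeholders rather than Docker’s real on-disk layout, and the script only prints the mount command, since actually running it needs root and existing directories.

```shell
# Sketch: build the option string for an overlay mount from a list of layers.
# All paths here are hypothetical placeholders, not Docker's real layout.
# Layers are listed top-most first, which is the order lowerdir= expects.
layers="/layers/app /layers/libs /layers/base"
upper="/containers/c1/diff"
work="/containers/c1/work"
merged="/containers/c1/merged"

# Join the layer paths with ":" to form the lowerdir value.
lowerdir=$(echo "$layers" | tr ' ' ':')
opts="lowerdir=$lowerdir,upperdir=$upper,workdir=$work"

# Print the command rather than running it.
echo "mount -t overlay overlay -o $opts $merged"
```

In the lowerdir list the leftmost entry is the top of the stack and the rightmost is the bottom, so a file in /layers/app would shadow one with the same path in /layers/base.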
Here’s how to see this on your Docker host (paths may vary slightly depending on your Docker installation and OS, typically under /var/lib/docker/overlay2/).
First, look up the storage driver details for the container.
docker inspect my-alpine | grep -A 8 '"GraphDriver"'
You’ll see something like:
"GraphDriver": {
    "Data": {
        "LowerDir": "/var/lib/docker/overlay2/a1b2c3d4e5f6...:/var/lib/docker/overlay2/f7e6d5c4b3a2...",
        "MergedDir": "/var/lib/docker/overlay2/abcdef123456.../merged",
        "UpperDir": "/var/lib/docker/overlay2/abcdef123456.../diff",
        "WorkDir": "/var/lib/docker/overlay2/abcdef123456.../work"
    },
    "Name": "overlay2"
}
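The MergedDir is the unified view you interact with inside the container. LowerDir lists the read-only layers underneath the container, including the layers of your image. UpperDir is the writable layer for this specific container. WorkDir is a scratch directory that OverlayFS uses internally, for example to stage a copy-up before the file appears in UpperDir.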
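If you want to look at the lower layers one at a time, you can split the LowerDir value yourself. The sketch below assumes the colon-separated form that the overlay2 driver reports and uses a made-up value with placeholder directory names (layer-a, layer-b, layer-c) in place of real inspect output.

```shell
# Sketch: split a LowerDir value into one layer per line.
# The value below is a hypothetical placeholder, not real inspect output.
lower_dirs="/var/lib/docker/overlay2/layer-c/diff:/var/lib/docker/overlay2/layer-b/diff:/var/lib/docker/overlay2/layer-a/diff"

# Entries are colon-separated; the first is the top-most layer.
layer_list=$(echo "$lower_dirs" | tr ':' '\n')
layer_count=$(echo "$layer_list" | wc -l | tr -d ' ')

echo "$layer_count lower layers, top to bottom:"
echo "$layer_list"
```

On a real host you could fill lower_dirs from docker inspect --format '{{.GraphDriver.Data.LowerDir}}' my-alpine instead of hard-coding it.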
The hello.txt file we created inside my-alpine is now in its UpperDir. The base Alpine filesystem is in the LowerDirs. When we ran another-alpine, it shared the same LowerDirs but had its own, empty UpperDir, hence hello.txt wasn’t visible.
The most surprising thing about OverlayFS is how it handles file deletion. When you rm a file that came from the image, such as /etc/alpine-release, nothing is removed from the LowerDir. Instead, OverlayFS creates a special "whiteout" in the container’s UpperDir: a character device with device number 0/0 that carries the deleted file’s name. This whiteout tells the kernel, "pretend this file doesn’t exist, even if it’s in a lower layer." This is crucial for maintaining the immutability of image layers.
To see a whiteout, inspect the UpperDir of a container in which you’ve deleted a file that came from a lower layer. For example, if you run an Alpine container and delete /etc/alpine-release, which ships with the image, you’ll find a whiteout entry at etc/alpine-release in the container’s diff directory. A file that you both create and delete inside the container leaves no whiteout, because it only ever existed in the UpperDir.
This layered, copy-on-write approach is what makes Docker images so efficient. Multiple containers can share the same base image layers, only storing the differences in their individual writable layers.
The next hurdle is understanding how Docker manages these layers and their storage on disk, especially when dealing with image cleanup and disk space.