etcd’s disk usage grows over time because it keeps old versions of keys, and unless you tell it otherwise, it will keep them forever.

Let’s see etcd in action. Imagine we have a simple key-value pair:

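# Put a key (the JSON gateway expects base64: "testkey" -> dGVzdGtleQ==, "testvalue" -> dGVzdHZhbHVl)
curl -L http://127.0.0.1:2379/v3/kv/put -d '{"key": "dGVzdGtleQ==", "value": "dGVzdHZhbHVl"}'

# Get the key (reads go through the range endpoint)
curl -L http://127.0.0.1:2379/v3/kv/range -d '{"key": "dGVzdGtleQ=="}'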
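The keys and values in these requests are base64-encoded because the JSON gateway carries them as bytes. A minimal sketch of building a request body by hand follows; b64 and put_body are hypothetical helper names, and the keys and values are assumed to contain no characters that need JSON escaping.

```shell
#!/bin/sh
# Hypothetical helpers for building JSON gateway request bodies.

# Base64-encode a string without a trailing newline or line wrapping.
b64() {
    printf '%s' "$1" | base64 | tr -d '\n'
}

# Build the JSON body for a put request from a plain-text key and value.
put_body() {
    printf '{"key": "%s", "value": "%s"}' "$(b64 "$1")" "$(b64 "$2")"
}

# Prints: {"key": "dGVzdGtleQ==", "value": "dGVzdHZhbHVl"}
put_body testkey testvalue
echo
```

The printed body can be passed to curl with -d "$(put_body testkey testvalue)".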

Now, let’s modify it:

# Modify the key (value is base64 for "newvalue1")
curl -L http://127.0.0.1:2379/v3/kv/put -d '{"key": "dGVzdGtleQ==", "value": "bmV3dmFsdWUx"}'

And again:

# Modify the key again (value is base64 for "newvalue2")
curl -L http://127.0.0.1:2379/v3/kv/put -d '{"key": "dGVzdGtleQ==", "value": "bmV3dmFsdWUy"}'

Even though a GET returns only the latest value, etcd still stores every previous version of testkey. This history is what makes etcd’s multi-version concurrency control (MVCC) work: clients can read the keyspace as of an earlier revision, and a watch can resume from a past revision without missing events. Without these historical versions, etcd couldn’t answer what the store looked like at a given revision or replay changes to a client that fell behind.

The problem arises when these old versions accumulate without bound. etcd stores its data in a bbolt database (a fork of Bolt). When you update a key, etcd doesn’t overwrite the old data in place. Instead, it writes the new value under a new revision. Over time, this leads to fragmentation and growth of the database file, even if the number of currently active keys is small.

To manage this, etcd provides two primary mechanisms: compaction and defragmentation.

Compaction is like garbage collection for old etcd revisions. It removes historical data that is older than a specified revision number. You tell etcd, "I no longer need any data before revision X." etcd then discards the superseded revisions older than X while keeping the value each live key had at X, so the current state is never lost. The freed space stays inside the bbolt database file as free pages, though, so the file itself doesn’t shrink.
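To make the idea concrete, here is a toy model of an append-only revision log with compaction. It is only an illustration written for this article, not etcd’s storage format: put, get, and compact are hypothetical shell functions operating on a temporary text file, and values are assumed to contain no spaces.

```shell
#!/bin/sh
# Toy model of a revision history (NOT etcd's real storage format).
log=$(mktemp)
rev=0

put() {      # put KEY VALUE: append a new revision, never overwrite
    rev=$((rev + 1))
    printf '%s %s %s\n' "$rev" "$1" "$2" >> "$log"
}

get() {      # get KEY [REV]: newest value of KEY at or before REV
    awk -v k="$1" -v r="${2:-$rev}" \
        '$2 == k && $1 <= r { v = $3 } END { if (v != "") print v }' "$log"
}

compact() {  # compact REV: drop entries superseded at or before REV
    awk -v r="$1" '
        { line[NR] = $0; revs[NR] = $1; keys[NR] = $2 }
        $1 <= r { last[$2] = NR }
        END {
            for (i = 1; i <= NR; i++)
                if (revs[i] > r || last[keys[i]] == i) print line[i]
        }' "$log" > "$log.tmp" && mv "$log.tmp" "$log"
}

put testkey testvalue   # revision 1
put testkey newvalue1   # revision 2
put testkey newvalue2   # revision 3
get testkey             # prints newvalue2
get testkey 1           # prints testvalue
compact 2               # revision 1 is dropped; revisions 2 and 3 remain
get testkey 2           # prints newvalue1
```

After compact 2, asking the toy model for revision 1 returns nothing; a real etcd server would instead reject the read with an error saying the revision has been compacted.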

Defragmentation, on the other hand, is the process of actually reclaiming the disk space occupied by the deleted data within the bbolt database file. It’s like running TRIM on an SSD. etcd rewrites the database file, consolidating the remaining data and discarding the free space.

Here’s how you perform these operations. First, you need to know the current revision of your etcd cluster.

ETCDCTL_API=3 etcdctl endpoint status --write-out=json | jq '.[0].Status.header.revision'

Let’s say this outputs 124000. Now, to compact etcd, you’d issue a command like this, specifying a revision before which you want to discard data. A common strategy is to compact up to the current revision minus a certain number of revisions (e.g., 1000) so that clients reading at a slightly older revision, or watches resuming from one, still find the history they need.

# Compact etcd, keeping revisions from 123000 onwards
ETCDCTL_API=3 etcdctl compact 123000

This command tells etcd to discard superseded history older than revision 123000; revision 123000 itself and everything after it stay readable. The etcdctl tool is the command-line interface for interacting with etcd. The ETCDCTL_API=3 environment variable selects the v3 API. It is the default from etcd 3.4 onward, so the variable only matters for older etcdctl builds.
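The "current revision minus N" arithmetic can be scripted. The sketch below assumes the JSON printed by etcdctl endpoint status --write-out=json contains a "revision" field with no space after the colon; current_revision and compaction_target are hypothetical helper names, and the sample string is canned data standing in for real output.

```shell
#!/bin/sh
# Sketch: derive a compaction target from endpoint-status JSON.

# Extract the first "revision" value from JSON on stdin.
current_revision() {
    grep -oE '"revision":[0-9]+' | grep -oE '[0-9]+' | head -n 1
}

# Print CURRENT minus KEEP; fail if there is nothing to compact yet.
compaction_target() {
    current=$1
    keep=$2
    target=$((current - keep))
    [ "$target" -gt 0 ] || return 1
    echo "$target"
}

# Canned sample standing in for: etcdctl endpoint status --write-out=json
sample='[{"Endpoint":"127.0.0.1:2379","Status":{"header":{"revision":124000,"raft_term":2}}}]'

rev=$(printf '%s' "$sample" | current_revision)
compaction_target "$rev" 1000   # prints 123000
```

Against a live cluster, the sample would be replaced by the real command’s output and the result passed to etcdctl compact. The guard avoids asking for a zero or negative revision on a young cluster.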

After compaction, the etcd data directory size might not immediately decrease. You need to run defragmentation to reclaim the space.

# Defragment etcd
ETCDCTL_API=3 etcdctl defrag

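This command triggers etcd to rewrite its internal database file, removing the "holes" left by compaction. The actual size reduction will be visible after this command completes. Defragmentation acts only on the endpoints etcdctl is pointed at (the local member by default), so run it against each member in turn or pass --cluster. The member is blocked while its file is rewritten, which is a good reason to defragment one member at a time.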
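Before defragmenting, it can help to estimate how much space is actually reclaimable. This sketch assumes the endpoint status JSON exposes "dbSize" (bytes allocated on disk) and "dbSizeInUse" (bytes holding live data), which recent etcd releases report. json_number and reclaimable_bytes are hypothetical helper names, and the sample is canned data.

```shell
#!/bin/sh
# Sketch: estimate how many bytes a defrag could give back.

# json_number FIELD: first integer value of FIELD in JSON on stdin.
json_number() {
    grep -oE "\"$1\":[0-9]+" | head -n 1 | grep -oE '[0-9]+$'
}

# Read endpoint-status JSON on stdin, print dbSize - dbSizeInUse.
reclaimable_bytes() {
    json=$(cat)
    total=$(printf '%s' "$json" | json_number dbSize)
    in_use=$(printf '%s' "$json" | json_number dbSizeInUse)
    [ -n "$total" ] && [ -n "$in_use" ] || return 1
    echo $((total - in_use))
}

# Canned sample standing in for: etcdctl endpoint status --write-out=json
sample='[{"Endpoint":"127.0.0.1:2379","Status":{"header":{"revision":124000},"dbSize":52428800,"dbSizeInUse":10485760}}]'

printf '%s' "$sample" | reclaimable_bytes   # prints 41943040
```

A small difference suggests a defrag would gain little, so it may not be worth blocking the member for it.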

It’s a good practice to automate these operations. You can set up a cron job to periodically compact and defragment etcd. A reasonable schedule might be to compact once an hour and defragment once a day, or even less frequently depending on your write load.

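For example, to compact every hour while keeping the most recent 1000 revisions:

# In your crontab (reads the current revision from endpoint status, then subtracts 1000):
0 * * * * rev=$(ETCDCTL_API=3 etcdctl endpoint status --write-out=json | grep -oE '"revision":[0-9]+' | grep -oE '[0-9]+' | head -1); ETCDCTL_API=3 etcdctl compact $((rev - 1000)) > /dev/null 2>&1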

And to defragment daily:

# In your crontab:
0 3 * * * ETCDCTL_API=3 etcdctl defrag > /dev/null 2>&1

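If the database grows past etcd’s storage quota (--quota-backend-bytes, 2 GB by default), etcd raises a cluster-wide NOSPACE alarm and accepts only reads and deletes. Compacting and defragmenting bring the database back under the quota, but the alarm does not clear on its own. Run etcdctl alarm disarm afterwards to let writes resume.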
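Clearing a NOSPACE alarm generally takes three steps in order: compact, defragment, then disarm. Here is a minimal sketch of that sequence. run, DRY_RUN, and recover_nospace are hypothetical names introduced for illustration so the sequence can be printed without touching a cluster, and the revision is assumed to have been read from endpoint status beforehand.

```shell
#!/bin/sh
# Sketch of the compact -> defrag -> disarm sequence for a NOSPACE alarm.
# run() and DRY_RUN are hypothetical helpers, not part of etcdctl.
export ETCDCTL_API=3

# Execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$@"; else "$@"; fi
}

# recover_nospace REV: stop at the first step that fails.
recover_nospace() {
    rev=$1
    run etcdctl compact "$rev" &&
    run etcdctl defrag &&
    run etcdctl alarm disarm
}

# Dry run: print the three commands instead of executing them.
DRY_RUN=1 recover_nospace 124000
```

Dropping DRY_RUN=1 executes the commands for real. Chaining with && means the alarm is only disarmed if compaction and defragmentation both succeeded.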

Compaction only removes history older than the revision you specify. Every revision from that point onward, including values that were superseded later, stays until a later compaction. Re-running etcdctl compact with a revision that has already been compacted is harmless: etcd rejects the request with an error saying the revision has been compacted and changes nothing. A revision beyond the current one is rejected as well.

The next thing you’ll likely encounter is managing etcd’s performance under heavy write loads, which often involves tuning its garbage collection parameters and understanding the impact of leases on revision management.
