etcd’s compare-and-swap (CAS) transactions are the bedrock for building robust, distributed state machines, not just simple key-value storage.

Let’s watch this in action. Imagine a distributed counter. Multiple clients might try to increment it concurrently. Without CAS, you’d have a race condition: two clients read the value 5, both increment it to 6, and write 6 back. The counter should be 7, but it’s 6. CAS prevents this.

Here’s how a CAS transaction works in etcd: a transaction is a set of conditions (compares), a set of actions to run on success, and an optional set of actions to run on failure. The conditions are checks on the current state of keys (e.g., "does key counter have the value 5?"). If all conditions are true, the success actions are executed (e.g., "put 6 into counter" and "set status to success"). If any condition is false, none of the success actions run, and the failure actions, if any, are executed instead. The entire operation is atomic: the checks and the chosen actions are applied as a single step, so no other client can modify the keys in between.

Consider a simple counter increment. We want to ensure that we only increment the counter if its current value is what we expect.

# Initial state: counter = 5
etcdctl put counter 5

# Client A tries to increment
etcdctl txn <<'EOF'
value("counter") = "5"

put counter 6
put status success


EOF
# Output:
# SUCCESS
#
# OK
#
# OK

# Now counter is 6 and status is success.
etcdctl get counter
etcdctl get status
# Output:
# counter
# 6
# status
# success

# Client B tries to increment, but the value is no longer 5
etcdctl txn <<'EOF'
value("counter") = "5"

put counter 6
put status success


EOF
# Output:
# FAILURE

# Counter and status remain unchanged.
etcdctl get counter
etcdctl get status
# Output:
# counter
# 6
# status
# success

The txn command is your primary tool here. In non-interactive mode it reads three stanzas from standard input, each terminated by a blank line: the compares, the operations to run if every compare holds, and the operations to run otherwise. Here, value("counter") = "5" means "the current value of counter must be 5". If this condition passes, the success operations are executed: put counter 6 sets the value of counter to 6, and put status success sets the value of status to success. The failure stanza is empty, which is why the input ends with two blank lines. Note that the two keys are read with separate get commands, because etcdctl get treats a second argument as the end of a key range, not as another key.

The power comes from combining multiple conditions and actions. Imagine a simple leader election. A node wants to become the leader. It can only become the leader if there is no current leader.

# No leader initially
etcdctl get leader

# Node 1 attempts to become leader
etcdctl txn <<'EOF'
create("leader") = "0"

put leader node1
put status leader


EOF
# Output:
# SUCCESS
#
# OK
#
# OK

# Node 1 is now the leader.
etcdctl get leader
etcdctl get status
# Output:
# leader
# node1
# status
# leader

# Node 2 attempts to become leader
etcdctl txn <<'EOF'
create("leader") = "0"

put leader node2
put status leader


EOF
# Output:
# FAILURE

# Node 2 fails to become leader because the leader key already exists.
etcdctl get leader
etcdctl get status
# Output:
# leader
# node1
# status
# leader

In this leader election example, create("leader") = "0" checks that the leader key does not exist: a key that has never been written, or has been deleted, has a create revision of 0. If the check holds, the success operations are executed, setting leader to node1 and status to leader. If leader already exists (meaning another node is the leader), the transaction fails. A value comparison cannot express "the key is absent", because etcd treats a value compare against a missing key as false.

You can also compare against revisions. This is useful for ensuring you’re acting on the most up-to-date information. For example, if you read a configuration, and then want to update it only if it hasn’t changed since you read it:

# Initial config
etcdctl put config '{"version": 1, "setting": "abc"}'

# Read config and its metadata; the JSON output includes mod_revision
etcdctl get config -w json
# Let's assume we read it and got mod_revision 3
# Now we want to update it, but only if it's still at mod_revision 3
etcdctl txn <<'EOF'
mod("config") = "3"

put config "{\"version\": 2, \"setting\": \"xyz\"}"


EOF

The mod("config") = "3" compare checks that the mod revision of the config key (the store revision at which the key was last written) is exactly 3. If it is, the new configuration is written. If another client updated config in the meantime, its mod revision would be higher than 3, and this transaction would fail. This is distinct from version("config"), which compares the key's version: a per-key counter of writes since the key was created.
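The read-then-guarded-write pattern above can be wrapped in two small helpers. This is a minimal sketch, not a definitive implementation: the function names mod_revision_of and put_if_unmodified are hypothetical, it assumes etcdctl v3 is on the PATH, and it assumes the "fields" output format prints a line of the form "ModRevision" : N. Values containing spaces or quotes would need quoting in the way etcdctl's put expects.

```shell
# Sketch: update a key only if it is unchanged since it was last read.
# Assumes etcdctl v3 on PATH; helper names are illustrative only.

# mod_revision_of KEY
# Prints the key's mod revision, parsed from etcdctl's "fields" output
# (assumed line format: "ModRevision" : 3).
mod_revision_of() {
  etcdctl get "$1" -w fields |
    awk -F' : ' '$1 == "\"ModRevision\"" { print $2 }'
}

# put_if_unmodified KEY EXPECTED_MOD_REVISION NEW_VALUE
# Returns 0 if the write was applied, 1 if the compare did not hold.
put_if_unmodified() {
  # Three stanzas, each ended by a blank line:
  # the compare, the success op, and an empty failure stanza.
  case "$(printf 'mod("%s") = "%s"\n\nput %s %s\n\n\n' \
            "$1" "$2" "$1" "$3" | etcdctl txn)" in
    SUCCESS*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller might run rev="$(mod_revision_of config)" when it reads the configuration, and later put_if_unmodified config "$rev" new-value; a non-zero return signals that someone else wrote the key first.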

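Through etcdctl, the success and failure branches of a transaction accept put, get, and del operations. Compaction is not a transaction operation, and there is no create operation; to write a key only if it does not already exist, guard a put with a compare that requires the key's create revision to be 0.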
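A transaction can also carry operations that run when the compare does not hold. The sketch below is one way to use that: it attempts a compare-and-swap and, if the swap loses, reads the current value back in the same round trip. The function name cas_or_read is hypothetical, and the sketch assumes etcdctl v3 on the PATH and simple values without spaces.

```shell
# Sketch: compare-and-swap with a failure branch that reads the key back.
# Assumes etcdctl v3 on PATH; the helper name is illustrative only.

# cas_or_read KEY EXPECTED_VALUE NEW_VALUE
# Prints etcdctl's txn output: it starts with SUCCESS if the swap was
# applied, or FAILURE followed by the result of the get otherwise.
cas_or_read() {
  # Stanza 1: the compare.
  # Stanza 2: runs only if the compare holds.
  # Stanza 3: runs only if the compare does not hold.
  printf 'value("%s") = "%s"\n\nput %s %s\n\nget %s\n\n' \
    "$1" "$2" "$1" "$3" "$1" | etcdctl txn
}
```

For example, cas_or_read counter 5 6 would set counter to 6 when it currently holds 5; otherwise the output would carry the value that is actually stored, which saves a separate read before the next attempt.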

A point that is easy to miss is that etcd transactions have no built-in retry mechanism. When a transaction fails because a condition was not met, etcd reports the failure and does nothing more; it only guarantees atomicity for each individual transaction attempt. The client must re-read the state and try again, and the application logic must decide when and how to retry.
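A client-side retry loop for the counter example might look like the following. It is a minimal sketch under stated assumptions: the function name increment_key and the default of five attempts are hypothetical choices, etcdctl v3 is assumed to be on the PATH, and the key is assumed to exist already and hold an integer.

```shell
# Sketch: optimistic increment with a bounded client-side retry loop.
# Assumes etcdctl v3 on PATH and that KEY already holds an integer.

# increment_key KEY [MAX_ATTEMPTS]
# Prints the new value on success; returns 1 if every attempt lost the race.
increment_key() {
  local key="$1"
  local max_attempts="${2:-5}"
  local attempt=1
  local current next result
  while [ "$attempt" -le "$max_attempts" ]; do
    # Read the current value (without the key name).
    current="$(etcdctl get "$key" --print-value-only)"
    next=$((current + 1))

    # Write $next only if the value is still $current.
    # The failure stanza is empty, hence the trailing blank lines.
    result="$(printf 'value("%s") = "%s"\n\nput %s %s\n\n\n' \
      "$key" "$current" "$key" "$next" | etcdctl txn)"

    case "$result" in
      SUCCESS*)
        echo "$next"
        return 0
        ;;
    esac

    # Another client changed the key first; re-read and try again.
    attempt=$((attempt + 1))
  done
  echo "gave up after $max_attempts attempts" >&2
  return 1
}
```

Calling increment_key counter performs one read and one transaction per attempt. Under heavy contention it would probably be worth adding a short randomized sleep between attempts, which this sketch leaves out.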

When you start building more complex distributed systems that require strong consistency guarantees, you’ll quickly find yourself relying on etcd’s CAS transactions for everything from consensus protocols to distributed locking.

The next hurdle is understanding how to manage etcd’s storage lifecycle, particularly with compaction and defragmentation, to prevent unbounded growth.
