etcd’s leader election has more to do with distributed consensus than with simply picking one node.

Let’s say you have a distributed system where you need exactly one instance of a service to perform a critical task, like writing to a shared database or managing a cache. You can’t have two instances doing it simultaneously, nor can you afford to have no instance doing it. This is where leader election comes in. etcd provides a robust mechanism for this.

Imagine a set of worker nodes. We want one of them to be the "leader" and the rest to be "followers." The leader will do the work, and if it fails, one of the followers must take over as soon as etcd detects the failure.

Here’s a simplified Go program demonstrating this. We’ll use etcd’s concurrency package, which abstracts away much of the low-level detail.

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints: []string{"localhost:2379"}, // Your etcd endpoint
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	session, err := concurrency.NewSession(cli, concurrency.WithTTL(10)) // Session with 10s TTL
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// Create a mutex under the "/my-app/leader-lock" prefix; whoever holds it is the leader
	elector := concurrency.NewMutex(session, "/my-app/leader-lock")

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	fmt.Println("Attempting to acquire leader lock...")
	if err := elector.Lock(ctx); err != nil {
		log.Fatalf("Failed to acquire lock: %v", err)
	}

	fmt.Println("Successfully acquired leader lock! I am the leader.")

	// Simulate leader work in a background goroutine
	go func() {
		ticker := time.NewTicker(5 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				fmt.Println("Leader heartbeat...")
			case <-ctx.Done():
				fmt.Println("Leader context done.")
				return
			}
		}
	}()

	// Block until the session ends (its lease expired or was revoked)
	<-session.Done()
	fmt.Println("Leader session ended, relinquishing leadership.")
	// The lock key is attached to the session's lease, so etcd deletes it when the session is closed or expires.
}

// Followers run this same program. Their Lock() call blocks for as long as another process holds the lock.
// If a follower should return immediately instead of waiting, it can call TryLock(),
// which returns concurrency.ErrLocked when the lock is already held.
// If the leader fails (session expires), its lock is released, and one of the waiting followers acquires it.

When you run multiple instances of this code, only one will print "Successfully acquired leader lock! I am the leader." The others will block on elector.Lock(ctx) until the leader’s session expires or is explicitly closed.

The core problem this solves is ensuring state consistency and preventing race conditions in distributed systems. Without a single, agreed-upon leader, concurrent operations could lead to data corruption or conflicting actions.

Internally, etcd uses the Raft consensus algorithm so that all members of the cluster agree on the state of the keyspace, including which client currently holds the leader lock. The concurrency package builds on this. A Session wraps an etcd lease with a Time-To-Live (TTL). When a client calls Lock() on a Mutex (which is what elector is), it writes its own key under the lock prefix and attaches that key to the session's lease. The client whose key has the lowest creation revision holds the lock; every other client waits for the keys created before its own to be deleted. If the holder disconnects or its session expires (because keepalives did not reach etcd within the TTL), etcd deletes the key along with the lease. That deletion wakes the next waiter in line, which then holds the lock.
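The ordering rule can be modeled without etcd at all. The sketch below is a toy, in-memory stand-in and not the etcd API: the lockTable type and its join, remove, and holder methods are hypothetical names introduced here only to show how "lowest creation revision wins" produces an orderly failover.

```go
package main

import (
	"fmt"
	"sort"
)

// lockTable is a toy stand-in for the keys stored under a lock prefix.
// It is NOT the etcd API; it only models the ordering rule.
type lockTable struct {
	nextRev int64
	keys    map[string]int64 // contender name -> creation revision
}

func newLockTable() *lockTable {
	return &lockTable{nextRev: 1, keys: make(map[string]int64)}
}

// join records a contender's key with the next revision number.
// Joining twice does not change the original creation revision.
func (lt *lockTable) join(name string) {
	if _, ok := lt.keys[name]; ok {
		return
	}
	lt.keys[name] = lt.nextRev
	lt.nextRev++
}

// remove deletes a contender's key, as an unlock or a lease expiry would.
func (lt *lockTable) remove(name string) { delete(lt.keys, name) }

// holder returns the contender with the lowest creation revision,
// or "" when nobody is contending.
func (lt *lockTable) holder() string {
	names := make([]string, 0, len(lt.keys))
	for n := range lt.keys {
		names = append(names, n)
	}
	if len(names) == 0 {
		return ""
	}
	sort.Slice(names, func(i, j int) bool {
		return lt.keys[names[i]] < lt.keys[names[j]]
	})
	return names[0]
}

func main() {
	lt := newLockTable()
	lt.join("worker-a")
	lt.join("worker-b")
	lt.join("worker-c")
	fmt.Println("holder:", lt.holder()) // prints "holder: worker-a"

	lt.remove("worker-a") // the leader's key is deleted
	fmt.Println("holder after failover:", lt.holder()) // prints "holder after failover: worker-b"
}
```

Because waiters are ordered by when they joined, a former leader that rejoins goes to the back of the queue rather than reclaiming the lock.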

The concurrency.NewSession call is crucial. concurrency.WithTTL(10) sets the lease TTL to 10 seconds, which means the session expires if etcd goes 10 seconds without receiving a keepalive for it. This is the lifeline of the leader. If the leader process crashes or becomes unresponsive, it stops sending keepalives. Once the TTL elapses, etcd expires the lease and deletes every key attached to it, including the leader lock key. This automatic cleanup is what allows for failover, and it also means failover is not instant: followers can wait up to one full TTL after a crash.
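To make the timing concrete, here is a minimal model of a TTL lease, again as a toy and not the etcd API. The lease type and the grant, keepAlive, and expired functions are hypothetical, and time is passed in as whole seconds so the example is deterministic.

```go
package main

import "fmt"

// lease is a toy model of a TTL lease. It is NOT the etcd API.
// All times are whole seconds since an arbitrary start.
type lease struct {
	ttl      int64
	deadline int64
}

// grant creates a lease that expires ttl seconds after now.
func grant(now, ttl int64) *lease {
	return &lease{ttl: ttl, deadline: now + ttl}
}

// expired reports whether the deadline has been reached.
func (l *lease) expired(now int64) bool { return now >= l.deadline }

// keepAlive pushes the deadline out by a full TTL.
// It returns false if the lease has already expired, because an
// expired lease cannot be revived.
func (l *lease) keepAlive(now int64) bool {
	if l.expired(now) {
		return false
	}
	l.deadline = now + l.ttl
	return true
}

func main() {
	l := grant(0, 10) // 10-second TTL, like concurrency.WithTTL(10)

	// A healthy leader renews at t=4 and t=8, then crashes.
	l.keepAlive(4)
	l.keepAlive(8)

	fmt.Println(l.expired(17))   // prints "false": deadline is t=18
	fmt.Println(l.expired(18))   // prints "true": 10s since the last keepalive
	fmt.Println(l.keepAlive(19)) // prints "false": too late to renew
}
```

The window between the last keepalive at t=8 and expiry at t=18 is the period during which no leader is doing work, which is the trade-off to weigh when choosing a TTL.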

What most people don’t realize is how tightly the lock depends on the session’s keepalives. The leader must periodically renew its session with etcd. NewSession starts this renewal in the background for you, but understanding that the lock is tied to the session’s liveness, not just the Lock() call, is key. If keepalives stop reaching etcd for longer than the TTL (because of a network partition or a long process pause, for example), the lease expires and the lock is released even though the leader process is still running. The leader should therefore stop its work as soon as session.Done() fires.

The next hurdle is implementing the actual "work" that only the leader should do, and ensuring followers can gracefully transition to doing nothing.
