Consul’s high availability isn’t about having multiple Consul servers; it’s about having multiple voting Consul servers that form a consensus group.
Let’s see it in action. Imagine we have three Consul servers, consul-1, consul-2, and consul-3, all running in the same datacenter. They’re configured to talk to each other.
# consul-1
consul agent -server -node=consul-1 -bootstrap-expect=3 -retry-join=consul-2 -retry-join=consul-3 -datacenter=dc1 -config-dir=/etc/consul.d
# consul-2
consul agent -server -node=consul-2 -bootstrap-expect=3 -retry-join=consul-1 -retry-join=consul-3 -datacenter=dc1 -config-dir=/etc/consul.d
# consul-3
consul agent -server -node=consul-3 -bootstrap-expect=3 -retry-join=consul-1 -retry-join=consul-2 -datacenter=dc1 -config-dir=/etc/consul.d
Here, -bootstrap-expect=3 tells each server to wait until 3 servers have joined before bootstrapping the cluster and electing the first leader. -retry-join is the crucial part for discovery: once a server starts, it tries to join the others at these addresses and keeps retrying until a join succeeds.
When you run consul members, you’ll see something like this, with Status showing alive and Type showing server:
Node      Address         Status  Type    Build  Protocol  DC   Segment
consul-1  10.0.1.10:8301  alive   server  1.1.0  2         dc1  <all>
consul-2  10.0.1.11:8301  alive   server  1.1.0  2         dc1  <all>
consul-3  10.0.1.12:8301  alive   server  1.1.0  2         dc1  <all>
The Type column reading server only shows that the agent runs in server mode. To confirm that each server is a healthy, voting member of the Raft consensus group, run consul operator raft list-peers and check that the Voter column reads true.
The problem this solves is ensuring that Consul’s control plane remains available even if one or more servers fail. Consul uses the Raft consensus algorithm to manage its state. For any write operation (like registering a service or updating a health check), a majority of the voting servers must agree. This means if you have 3 servers, you can tolerate 1 failure. If you have 5 servers, you can tolerate 2 failures.
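The majority rule above reduces to simple arithmetic. The following is a minimal sketch of it; the function names are illustrative and are not part of any Consul API.

```python
def quorum(voters: int) -> int:
    """Smallest number of voting servers that forms a majority."""
    if voters < 1:
        raise ValueError("need at least one voting server")
    return voters // 2 + 1


def failure_tolerance(voters: int) -> int:
    """How many voters can fail while a majority still remains."""
    return voters - quorum(voters)


for n in (1, 2, 3, 4, 5, 7):
    print(f"{n} voters: quorum={quorum(n)}, tolerates {failure_tolerance(n)} failure(s)")
```

Note that an even number of voters buys nothing extra: 4 servers tolerate 1 failure, the same as 3, which is why odd cluster sizes are the usual choice.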
Internally, each server in the consensus group maintains a replicated log of all state changes. When a change is proposed, it’s sent to the leader. The leader appends it to its own log and replicates it to the followers. Once a majority of the servers (the leader included) have stored the change, it’s committed and applied to the state machine on every server as each one catches up. This process guarantees that all servers eventually agree on the same state, even in the presence of network partitions or server failures, as long as a majority can communicate.
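To make the commit rule concrete, here is a simplified sketch of how a leader could derive the commit point from what each server has stored. It is an illustration of the majority rule only, not Consul's actual code: real Raft additionally requires the entry to belong to the leader's current term, and the match_index mapping and server names here are assumptions for the example.

```python
def commit_index(match_index: dict[str, int]) -> int:
    """Highest log index that is stored on a majority of servers.

    match_index maps each server (leader included) to the highest log
    index known to be persisted on it.
    """
    if not match_index:
        raise ValueError("need at least one server")
    ordered = sorted(match_index.values(), reverse=True)
    majority = len(ordered) // 2 + 1
    # The value at position majority-1 is held by at least `majority` servers.
    return ordered[majority - 1]


# The leader has appended entry 7; one follower has caught up, one lags.
print(commit_index({"consul-1": 7, "consul-2": 7, "consul-3": 5}))  # 7
# One follower is stuck at 5 and the other has only reached 6.
print(commit_index({"consul-1": 7, "consul-2": 6, "consul-3": 5}))  # 6
```

In the second case entry 7 exists only on the leader, so it is not yet committed even though the leader has it in its log.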
The retry-join mechanism is key for initial bootstrapping and for recovery. If a server restarts or a new server is added, it uses retry-join to discover existing cluster members and rejoin the cluster. This is where redundancy zones come into play: instead of placing all servers, and the addresses retry-join points to, in the same physical rack or the same availability zone, you spread them across different zones.
For example, if you have three availability zones, us-east-1a, us-east-1b, and us-east-1c, you’d configure the Consul server in each zone to join the servers in the other two zones.
Consider a setup with one Consul server in each of three availability zones, az-a, az-b, and az-c.
config.json on consul-az-a (joins consul-az-b at 10.0.2.10 and consul-az-c at 10.0.3.10):
{
  "server": true,
  "datacenter": "us-east-1",
  "bootstrap_expect": 3,
  "retry_join": [
    "10.0.2.10",
    "10.0.3.10"
  ]
}
config.json on consul-az-b (joins consul-az-a at 10.0.1.10 and consul-az-c at 10.0.3.10):
{
  "server": true,
  "datacenter": "us-east-1",
  "bootstrap_expect": 3,
  "retry_join": [
    "10.0.1.10",
    "10.0.3.10"
  ]
}
config.json on consul-az-c (joins consul-az-a at 10.0.1.10 and consul-az-b at 10.0.2.10):
{
  "server": true,
  "datacenter": "us-east-1",
  "bootstrap_expect": 3,
  "retry_join": [
    "10.0.1.10",
    "10.0.2.10"
  ]
}
This configuration ensures that if az-a experiences a complete outage, the Consul servers in az-b and az-c can still communicate, maintain quorum (2 out of 3), and continue serving requests. The datacenter field is important here: it groups these servers into one logical Consul datacenter, and therefore one Raft consensus group, even though they are physically separated.
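Whether a given placement survives a zone outage can be checked with the same majority arithmetic. The sketch below is a hypothetical helper, not a Consul tool: it takes the number of voting servers per zone and reports, for each zone, whether quorum would remain if that whole zone went down.

```python
def survives_zone_outage(placement: dict[str, int]) -> dict[str, bool]:
    """For each zone, report whether quorum survives losing that whole zone.

    placement maps a zone name to the number of voting servers in it.
    """
    total = sum(placement.values())
    quorum = total // 2 + 1
    return {zone: total - count >= quorum for zone, count in placement.items()}


# One voter per zone, as in the three-zone configuration described here.
print(survives_zone_outage({"az-a": 1, "az-b": 1, "az-c": 1}))
# Three voters squeezed into two zones: losing az-a also loses quorum.
print(survives_zone_outage({"az-a": 2, "az-b": 1}))
```

The second layout shows why two zones are not enough for three voters: whichever zone holds two of them becomes a single point of failure for the whole cluster.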
Most people don’t realize that Consul doesn’t expose Raft’s heartbeat timeout as a standalone setting: the Raft timeouts are scaled together by performance.raft_multiplier, which defaults to 5 and gives a heartbeat timeout of roughly 5 seconds. If the followers don’t hear from the leader within this timeframe, one of them starts a new leader election, and a leader that can’t reach a majority of its followers steps down. In multi-zone setups, especially those with higher network latency, timeouts that are too tight for the links between zones can lead to unnecessary leader elections and instability. You might need to increase raft_multiplier (the maximum is 10) in your server configuration to accommodate longer network delays between your Consul servers.
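As a rough aid for reasoning about this tuning, the sketch below scales a set of base Raft timings by performance.raft_multiplier and prints the matching configuration stanza. The base values (1000 ms heartbeat, 1000 ms election, 500 ms leader lease) are assumptions taken from the Raft library's defaults as I understand them; verify them against the Consul version you run before relying on the numbers.

```python
import json
from datetime import timedelta

# Assumed base timings (Raft library defaults); check your Consul version.
BASE_TIMINGS = {
    "heartbeat_timeout": timedelta(milliseconds=1000),
    "election_timeout": timedelta(milliseconds=1000),
    "leader_lease_timeout": timedelta(milliseconds=500),
}


def scaled_timings(raft_multiplier: int) -> dict[str, float]:
    """Effective timeouts in seconds for a given raft_multiplier (1 to 10)."""
    if not 1 <= raft_multiplier <= 10:
        raise ValueError("raft_multiplier must be between 1 and 10")
    return {
        name: (base * raft_multiplier).total_seconds()
        for name, base in BASE_TIMINGS.items()
    }


def performance_stanza(raft_multiplier: int) -> str:
    """JSON fragment to merge into a server's configuration file."""
    return json.dumps({"performance": {"raft_multiplier": raft_multiplier}}, indent=2)


print(scaled_timings(5))
print(performance_stanza(7))
```

A larger multiplier makes the cluster more tolerant of slow links but also slower to notice a genuinely failed leader, so it is a trade-off rather than a free fix.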
The next step is typically understanding how to manage service registration and discovery across multiple datacenters.