Because Consul runs as a cluster of cooperating agents, you can upgrade it one node at a time without taking your services down.

Let’s say you’re running Consul 1.10.2 and want to get to 1.11.5. The key is that Consul is designed for rolling upgrades. You don’t upgrade all your Consul servers at once. Instead, you upgrade them one by one, allowing the cluster to maintain quorum and availability throughout the process. This applies to Consul clients as well; they’ll automatically reconnect to the upgraded servers.

Here’s a typical scenario: you have a Consul cluster with 5 servers. You’ll upgrade them sequentially.

1. Upgrade the First Server

  • Stop the Consul agent on one server.
  • Replace the binary with the new version (e.g., consul-1.11.5).
  • Start the Consul agent again.

The Consul agent will rejoin the cluster when it starts. Since the other four servers kept running while it was down, the cluster never lost quorum and remains healthy. The upgraded server then catches up on any state changes it missed by replicating them from the current leader.

2. Upgrade the Remaining Servers

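Repeat the process for each of the other Consul servers, one at a time. Stop the agent, replace the binary, and restart the agent. Wait until the upgraded server is back and healthy before you stop the next one, and upgrade the current leader last so that the cluster goes through as few leader elections as possible.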
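One way to pick the order is to list the Raft peers and put the leader at the end. The following is a minimal sketch, not an official procedure. It assumes the output of consul operator raft list-peers has the node name in the first column and the State (leader or follower) in the fourth; upgrade_order is a hypothetical helper name.

```shell
# upgrade_order: read `consul operator raft list-peers` output on stdin and
# print node names with followers first and the current leader last.
# Assumption: column 1 is the node name and column 4 is the Raft state.
upgrade_order() {
  awk 'NR > 1 && $4 != "leader" { print $1 }
       NR > 1 && $4 == "leader" { leader = $1 }
       END { if (leader != "") print leader }'
}

# Usage (on a machine that can reach the cluster):
#   consul operator raft list-peers | upgrade_order
```

Leadership can move while the upgrade is in progress, so it is worth re-running the check before each server rather than trusting a list computed at the start.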

3. Upgrade Consul Clients

Consul clients do not take part in Raft consensus and hold none of the cluster’s replicated state. You can upgrade their binaries the same way (stop, replace, start), and they will reconnect to the servers on restart. It’s usually safest to do this after all servers have been upgraded, so that no client ever runs a newer version than the servers it talks to.

What’s Happening Under the Hood?

Consul servers use the Raft consensus algorithm, which requires a majority of servers (a quorum) to agree before a state change is committed. A 5-server cluster has a quorum of 3, so with one server stopped for its upgrade, 4 are still running and the cluster could lose one more and keep operating. Servers on the old and new versions can work together during a rolling upgrade between minor versions such as 1.10 and 1.11. When the upgraded server restarts, it syncs its state with the cluster and becomes eligible for leader election like any other server.
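The quorum arithmetic is easy to check for other cluster sizes. A small sketch, where quorum and failure_tolerance are illustrative function names rather than anything Consul provides:

```shell
# quorum N: servers that must agree for an N-server Raft cluster to commit.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# failure_tolerance N: servers that can be offline while quorum still holds.
failure_tolerance() {
  echo $(( $1 - $1 / 2 - 1 ))
}

for n in 3 5 7; do
  echo "servers=$n quorum=$(quorum "$n") tolerance=$(failure_tolerance "$n")"
done
```

With 3 servers the tolerance is 1, which means a rolling upgrade uses up the whole safety margin while a server is down; with 5 servers there is still room for one unplanned failure.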

The Crucial Detail: Version Compatibility

Consul’s rolling upgrade strategy works because a newer agent can interoperate with agents running somewhat older versions, so the cluster can run mixed versions while the upgrade is in progress. Patch upgrades within one release series are the simplest case; for example, you can roll from 1.10.2 to 1.10.5. Compatibility is also maintained when moving between release series (e.g., 1.10.x to 1.11.x, or 1.11.x to 1.12.x), but the order of the upgrade matters, and some releases carry version-specific upgrade notes that you should read before starting.

The most robust way to handle a minor version upgrade like 1.10.x to 1.11.x is to upgrade the servers first, one by one. The server upgrade is complete once every server reports the new version. At this point, clients running the older version will still work, but they won’t be able to take advantage of new features, so plan to upgrade them as well.
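Before moving on to the clients, it can help to confirm that every server reports the target build. The sketch below parses consul members output and assumes the Type value is in the fourth column and the Build value in the fifth; server_builds and all_servers_on are hypothetical helper names.

```shell
# server_builds: read `consul members` output on stdin and print each
# distinct Build value among servers, followed by how many servers run it.
# Assumption: column 4 is Type and column 5 is Build.
server_builds() {
  awk 'NR > 1 && $4 == "server" { count[$5]++ }
       END { for (b in count) print b, count[b] }' | sort
}

# all_servers_on VERSION: succeed only if every server in the
# `consul members` output on stdin reports exactly VERSION as its build.
all_servers_on() {
  awk -v want="$1" 'BEGIN { bad = 0 }
       NR > 1 && $4 == "server" && $5 != want { bad = 1 }
       END { exit bad }'
}

# Usage:
#   consul members | server_builds
#   consul members | all_servers_on 1.11.5 && echo "servers done"
```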

Example Configuration Snippet (Consul Server)

Let’s assume your Consul server is running with systemd.

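# /etc/systemd/system/consul.service
[Unit]
Description=Consul
# Wait for the network to be configured before starting the agent.
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/usr/local/bin/consul agent -config-dir=/etc/consul.d/
# On stop, signal only this unit's main process instead of every process named consul.
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
User=consul
Group=consul

[Install]
WantedBy=multi-user.target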

Upgrade Steps in Practice:

  1. Download and unpack the new Consul binary in a scratch directory, and check that it reports the expected version:

    wget https://releases.hashicorp.com/consul/1.11.5/consul_1.11.5_linux_amd64.zip
    unzip consul_1.11.5_linux_amd64.zip
    ./consul version

  2. Stop the Consul agent on Server A:

    sudo systemctl stop consul

  3. Verify the agent is stopped:

    pgrep -x consul

    (No output means no consul agent process is running.)

  4. Install the new binary and start the Consul agent on Server A:

    sudo mv consul /usr/local/bin/consul
    sudo systemctl start consul

  5. Monitor the cluster health: Use the Consul UI or CLI to check the status of the cluster. You should see Server A rejoin the cluster and its version update.

    consul members

    Look at the Status and Build columns. Once Server A is back and healthy, its status should be alive and its build should show 1.11.5. The Protocol column is the protocol version the agent speaks, not the Consul release, so it will usually stay the same.

  6. Repeat for Servers B, C, D, and E.
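Step 5 can be turned into a simple gate that blocks until the upgraded server is back. This is a rough sketch rather than a complete health check: it only polls consul members, and it assumes the node name, Status, and Build are in columns 1, 3, and 5. The name wait_for_build and the node name server-a are made up for the example.

```shell
# wait_for_build NODE VERSION [TRIES] [DELAY_SECONDS]
# Poll `consul members` until NODE is alive and reports build VERSION.
# Returns 0 on success, 1 if the node has not shown up after TRIES attempts.
wait_for_build() {
  wfb_node=$1
  wfb_version=$2
  wfb_tries=${3:-30}
  wfb_delay=${4:-2}
  wfb_i=0
  while [ "$wfb_i" -lt "$wfb_tries" ]; do
    # awk exits 0 only if a matching line was seen.
    if consul members 2>/dev/null |
       awk -v n="$wfb_node" -v v="$wfb_version" \
         '$1 == n && $3 == "alive" && $5 == v { found = 1 }
          END { exit !found }'
    then
      return 0
    fi
    wfb_i=$((wfb_i + 1))
    sleep "$wfb_delay"
  done
  return 1
}

# Usage:
#   wait_for_build server-a 1.11.5 || echo "server-a has not rejoined yet"
```

A server that shows as alive in the member list may still be catching up on Raft state, so treat this as a minimum check before stopping the next server.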

Important Considerations:

  • Configuration Directory: Ensure your configuration directory (/etc/consul.d/ in the example) is correctly specified in your service file. The Consul data directory (whatever data_dir points to, commonly /opt/consul) should not be cleared.
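  • Protocol Version: Don’t expect protocol versions to change just because the binaries did; between releases such as 1.10 and 1.11 they normally stay the same. consul members shows the protocol version each agent speaks, consul operator raft list-peers shows the Raft protocol version of each server, and the release each agent is running appears in the Build column of consul members.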
  • Client Upgrades: Clients can be upgraded one at a time on their own schedule, preferably after the servers. They reconnect to the servers automatically when they restart.
  • Testing: Always test your upgrade process in a staging environment that mirrors your production setup.
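  • Backup: While rolling upgrades are designed to be non-disruptive, it’s always wise to take a backup before starting. Use consul snapshot save to capture the cluster state, and keep a copy of the Consul data directory as well if you want a second safety net.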
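The backup step could look like the following sketch. It uses consul snapshot save and consul snapshot inspect; the function name take_snapshot, the file naming scheme, and the target directory are assumptions made for the example.

```shell
# take_snapshot [DIR]: save a Consul snapshot into DIR (default: current
# directory), check that it can be read back, and print the file path.
take_snapshot() {
  snap_dir=${1:-.}
  snap_file="$snap_dir/consul-$(date +%Y%m%dT%H%M%S).snap"
  # Send the command's own messages to stderr so stdout carries only the path.
  consul snapshot save "$snap_file" >&2 || return 1
  consul snapshot inspect "$snap_file" >/dev/null || return 1
  echo "$snap_file"
}

# Usage:
#   snap=$(take_snapshot /var/backups) && echo "snapshot stored at $snap"
```

If ACLs are enabled on the cluster, the snapshot commands need a token with sufficient privileges, so the sketch would have to run with that token available in its environment.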

The next challenge you’ll face is figuring out how to upgrade Consul Connect features, which often involves separate steps for proxies and service definitions.
