Redis Cluster is surprisingly resilient, but its high availability doesn’t come from replicating data to a single standby. Instead, it shards data across multiple master nodes, each of which can have its own replicas, and it handles failover without a central coordinator.
Let’s see it in action. Imagine you have a simple Redis setup and you want to make it highly available. You’d start with a few Redis instances:
# Run each instance from its own working directory so their data files don't collide
# Start node 1 on port 7000 (cluster mode must be enabled explicitly)
redis-server --port 7000 --cluster-enabled yes --cluster-config-file nodes-7000.conf
# Start node 2 on port 7001
redis-server --port 7001 --cluster-enabled yes --cluster-config-file nodes-7001.conf
# Start node 3 on port 7002
redis-server --port 7002 --cluster-enabled yes --cluster-config-file nodes-7002.conf
Now, you need to tell these nodes they’re part of a cluster. You’ll need at least six nodes for a production-ready setup (three masters and three replicas), but for demonstration, we’ll start with three and add replicas later. First, create the cluster, assigning slots to masters:
redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 --cluster-replicas 0
This command will prompt you to confirm the slot distribution. Redis will divide its 16384 hash slots among the masters. For instance, node 7000 might get slots 0-5460, 7001 slots 5461-10922, and 7002 slots 10923-16383.
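To make the arithmetic concrete, here is a minimal Python sketch of an even split that reproduces the ranges above. The function name split_slots is made up for this illustration, and the rounding approach is an assumption chosen to match the example; the allocation redis-cli proposes may differ in detail, so always check the plan it prints before confirming.

```python
TOTAL_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def split_slots(master_count):
    """Return one inclusive (first, last) slot range per master."""
    if master_count < 1:
        raise ValueError("need at least one master")
    slots_per_node = TOTAL_SLOTS / master_count  # fractional share per master
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(master_count):
        last = round(cursor + slots_per_node - 1)
        # The final master always absorbs whatever is left.
        if index == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += slots_per_node
    return ranges


for port, (first, last) in zip((7000, 7001, 7002), split_slots(3)):
    print(f"node {port}: slots {first}-{last}")
```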
Once the cluster is created, you can add replicas to provide fault tolerance. If a master goes down, one of its replicas will be promoted to take its place. Start the new instance in cluster mode first (here, on port 7003), then attach it as a replica:
redis-cli --cluster add-node 127.0.0.1:7003 127.0.0.1:7000 --cluster-slave --cluster-master-id <master_node_id>
You’d repeat this for each master, assigning a replica to it. The <master_node_id> is the unique ID of the master node, which you can find using redis-cli -c -p 7000 CLUSTER NODES.
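If you are scripting this step, the master IDs can be pulled out of the CLUSTER NODES text. The sketch below is a hedged illustration rather than a full parser: parse_masters is a hypothetical helper, the node IDs in SAMPLE are made up, and it relies only on the first three whitespace-separated fields of each line (node ID, address, flags).

```python
def parse_masters(cluster_nodes_output):
    """Map 'host:port' to node ID for every master in CLUSTER NODES output."""
    masters = {}
    for line in cluster_nodes_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a complete node line
        node_id, address, flags = fields[0], fields[1], fields[2].split(",")
        if "master" in flags:
            host_port = address.split("@")[0]  # drop the cluster bus port
            masters[host_port] = node_id
    return masters


# Made-up 40-character node IDs, for illustration only.
ID_A = "a" * 40
ID_B = "b" * 40

SAMPLE = (
    f"{ID_A} 127.0.0.1:7000@17000 myself,master - 0 0 1 connected 0-5460\n"
    f"{ID_B} 127.0.0.1:7003@17003 slave {ID_A} 0 1700000000000 1 connected\n"
)

print(parse_masters(SAMPLE))
```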
The core problem Redis Cluster solves is scaling your cache beyond the memory of a single machine while maintaining availability. It achieves this by sharding your data across multiple master nodes. Each master is responsible for a subset of the 16384 hash slots. A key’s slot is CRC16(key) mod 16384, and a cluster-aware client sends each command to the node that currently owns that slot.
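The key-to-slot mapping is small enough to sketch in a few lines of Python. This is an illustration, not a client implementation: key_slot is a hypothetical helper name, and it uses binascii.crc_hqx from the standard library with an initial value of 0, which computes the CRC16 variant Redis Cluster uses. It also applies the hash tag rule, where only the text between the first { and the following } is hashed if that text is non-empty.

```python
import binascii

TOTAL_SLOTS = 16384


def key_slot(key):
    """Compute the cluster hash slot for a key (str or bytes)."""
    data = key.encode() if isinstance(key, str) else key
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Hash only the tag when there is at least one character inside the braces.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % TOTAL_SLOTS


print(key_slot("foo"))
# Keys sharing a hash tag land in the same slot, so multi-key commands can use them together.
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

You can cross-check any result against a live node with CLUSTER KEYSLOT <key>.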
Internally, the nodes keep their view of the cluster in sync with a gossip protocol. Each node periodically exchanges information about cluster state, including which nodes are masters, which are replicas, and which hash slots each master owns. If a client sends a command to a node that does not own the key’s slot, that node responds with a MOVED redirect naming the correct node to contact. Clients are expected to cache this slot-to-node mapping to avoid repeated redirects.
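A rough Python sketch of the client side of that exchange follows. It assumes the redirect arrives as an error string of the form MOVED <slot> <host>:<port>; the names parse_moved and SlotCache are hypothetical, and real client libraries handle this for you (usually by refreshing the whole slot map rather than a single entry).

```python
def parse_moved(error_message):
    """Return (slot, host, port) for a MOVED redirect, or None for anything else."""
    parts = error_message.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return int(parts[1]), host, int(port)


class SlotCache:
    """Tiny slot-to-node cache that learns from MOVED redirects."""

    def __init__(self):
        self._owner = {}

    def node_for(self, slot):
        """Return the cached (host, port) for a slot, or None if unknown."""
        return self._owner.get(slot)

    def apply_redirect(self, error_message):
        """Record the new owner and return it so the caller can retry there."""
        moved = parse_moved(error_message)
        if moved is None:
            return None
        slot, host, port = moved
        self._owner[slot] = (host, port)
        return host, port


cache = SlotCache()
target = cache.apply_redirect("MOVED 3999 127.0.0.1:6381")
print(target, cache.node_for(3999))
```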
The actual levers you control are primarily in your redis.conf and the redis-cli --cluster commands. Key redis.conf settings include:
cluster-enabled yes: Essential to enable cluster mode.
cluster-config-file nodes-<port>.conf: This file is automatically managed by Redis and stores the cluster topology. Do not edit it manually.
cluster-port <port>: Sets the cluster bus port, which is separate from the client port and carries inter-node communication (e.g., port 6379, cluster-port 16379). If you leave it unset, the bus port defaults to the client port plus 10000. This setting is available in Redis 7.0 and later.
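When you run several nodes on one machine, it can help to generate these settings per port instead of writing them by hand. The following is a small Python sketch under that assumption; render_node_config is a hypothetical helper, and it emits only the directives discussed above, not a complete redis.conf.

```python
def render_node_config(port, bus_port=None):
    """Return minimal cluster-mode redis.conf text for one node."""
    lines = [
        f"port {port}",
        "cluster-enabled yes",
        f"cluster-config-file nodes-{port}.conf",
    ]
    if bus_port is not None:
        # Only needed when the bus port should differ from the default.
        lines.append(f"cluster-port {bus_port}")
    return "\n".join(lines) + "\n"


for port in (7000, 7001, 7002):
    print(f"# redis-{port}.conf")
    print(render_node_config(port))
```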
The redis-cli --cluster commands are your primary interface for managing the cluster:
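create: Initializes a new cluster.
add-node: Adds a new node (master or replica).
del-node: Removes a node.
reshard: Moves hash slots between masters, crucial for rebalancing.
Manual failover is not a --cluster subcommand. To trigger one, send CLUSTER FAILOVER to a replica of the master you want to replace (for example, redis-cli -p 7003 CLUSTER FAILOVER).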
The most surprising aspect of Redis Cluster’s failover is that it’s not a primary-replica hot-standby system in the traditional sense. When a master has been unreachable for longer than a configurable period (cluster-node-timeout), the nodes that cannot reach it flag it as possibly failed (PFAIL) and share that suspicion through gossip. Once a majority of masters report the same thing, the node is marked as failed (FAIL). One of its replicas then requests votes from the masters, and it is promoted if a majority of all masters in the cluster, counting the failed one, vote for it. This distributed agreement allows for failover without a central coordinator.
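The majority arithmetic also explains why three masters is the practical minimum. The sketch below is a simplified model, assuming the majority is counted over every master that owns slots, including the one that failed; votes_needed and can_fail_over are hypothetical names, and the real election also involves epochs and timing that are not modeled here.

```python
def votes_needed(total_masters):
    """Smallest number of master votes that forms a strict majority."""
    return total_masters // 2 + 1


def can_fail_over(total_masters, reachable_masters):
    """True if the masters still reachable are enough to elect a replica."""
    return reachable_masters >= votes_needed(total_masters)


# Three masters, one down: the two survivors form a majority.
print(can_fail_over(total_masters=3, reachable_masters=2))
# Two masters, one down: the lone survivor cannot promote anything.
print(can_fail_over(total_masters=2, reachable_masters=1))
```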
The next concept to grapple with is how to handle Redis Cluster when your application needs to scale writes beyond what a single master can handle, even with sharding.