CockroachDB’s replication factor isn’t just about how many copies of your data exist; it’s about how many distinct failure domains those copies reside in.

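Let’s see it in action. Imagine a three-node cluster: n1 in us-east-1a, n2 in us-east-1b, and n3 in us-east-1c, each started with a --locality flag that names its region and zone (for example, --locality=region=us-east-1,zone=us-east-1a for n1).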

-- Create a database and a table
CREATE DATABASE myapp;
USE myapp;
CREATE TABLE users (
    id INT PRIMARY KEY,
    name STRING
);

-- Keep 3 replicas of every range and require one in each availability zone.
-- The zone names are unique here, so matching on zone alone is enough.
ALTER DATABASE myapp CONFIGURE ZONE USING
    num_replicas = 3,
    constraints = '{"+zone=us-east-1a": 1, "+zone=us-east-1b": 1, "+zone=us-east-1c": 1}';

When you write data to the users table, CockroachDB keeps three replicas of each range that holds it. The zone configuration dictates where those replicas must live: one in us-east-1a, one in us-east-1b, and one in us-east-1c. If an entire availability zone (e.g., us-east-1a) goes offline, your data remains accessible because the two surviving replicas still form a majority of the three-member Raft group.
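To confirm what the cluster will actually enforce, you can read the configuration back. The statements below are a minimal sketch, assuming the nodes were started with locality flags that name a region and a zone; the exact output layout varies between CockroachDB versions.

```sql
-- Locality of the node this SQL session is connected to
SHOW LOCALITY;

-- Zone configuration currently in effect for the database
SHOW ZONE CONFIGURATION FROM DATABASE myapp;
```

If the second statement reports the default configuration rather than your constraints, the database-level zone configuration has not been applied.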

The problem this addresses is availability and fault tolerance in distributed systems. Traditional databases often replicate data synchronously within a single data center, so losing that data center means downtime. CockroachDB’s zone configurations let you distribute replicas across multiple physical locations (availability zones or even regions), making your application resilient to infrastructure failures.

Internally, CockroachDB uses Raft consensus for each range (a contiguous slice of your data). The num_replicas setting determines how many replicas participate in the Raft group for that range, and the constraints in the zone configuration act as placement policy. When a replica needs to be created or moved, CockroachDB’s allocator consults these constraints, identifies nodes whose locality satisfies them, and places replicas there. If a node fails, its replicas are not replaced immediately: once the node has been unreachable long enough to be considered dead (5 minutes by default, controlled by the server.time_until_store_dead cluster setting), the allocator creates replacement replicas on suitable live nodes, within the constraints, to restore the desired replication factor and distribution.
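Two statements help when observing this repair behavior on a running cluster. This is a sketch rather than a full procedure: server.time_until_store_dead is the cluster setting that governs how long an unresponsive store is tolerated before its replicas are rebuilt elsewhere, and the columns returned by SHOW RANGES differ between versions.

```sql
-- How long a store may be unresponsive before it is treated as dead
SHOW CLUSTER SETTING server.time_until_store_dead;

-- Ranges backing the table, including where their replicas live
SHOW RANGES FROM TABLE myapp.users;
```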

You control this behavior with CONFIGURE ZONE. The default zone configuration (RANGE default) applies to all databases and tables unless overridden, and you can set specific zone configurations for individual databases or even tables. num_replicas is the total number of copies of each range, while constraints specifies the distribution policy. In the per-replica form, constraints is a JSON object that maps a constraint expression to a replica count. For example, '{"+zone=us-east-1a": 1}' means "place 1 replica on a node whose locality includes zone=us-east-1a." A + prefix makes the constraint required, and a - prefix prohibits placement on matching nodes. All entries apply together; the order in which they are written is not a priority ranking. The counts must not add up to more than num_replicas, and if they add up to fewer, the remaining replicas can be placed on any node.
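A table-level override can be sketched as follows. The choice to pin only one replica is illustrative, not a recommendation: it shows a per-replica constraint whose counts add up to fewer than num_replicas, leaving the other two replicas free to land on any node.

```sql
-- Override the database-level policy for one table:
-- one replica must be in us-east-1a, the other two are unconstrained
ALTER TABLE myapp.users CONFIGURE ZONE USING
    num_replicas = 3,
    constraints = '{"+zone=us-east-1a": 1}';

-- Inspect the configuration now in effect for the table
SHOW ZONE CONFIGURATION FROM TABLE myapp.users;

-- Drop the override so the table inherits from the database again
ALTER TABLE myapp.users CONFIGURE ZONE DISCARD;
```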

The most surprising thing about CockroachDB’s replication and zone configuration is how it handles an outage when constraints are very strict. If your zone configuration requires one replica in each of three distinct locations and every node in one location becomes unreachable, CockroachDB will not move the missing replica into one of the surviving locations to restore the count. Required constraints (those prefixed with +) are treated as hard placement requirements, not suggestions, so the affected ranges stay under-replicated until the original location returns or you change the zone configuration. Data remains readable and writable in the meantime because two of the three replicas still form a Raft quorum, but a second failure during that window would cost quorum and make those ranges unavailable. The system prioritizes the topology you defined over a temporary fix that would violate it, and the price is reduced fault tolerance for the duration of the outage.
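During such an outage it helps to see how many ranges are short of replicas or in violation of a constraint. The queries below are a sketch that assumes your CockroachDB version still provides the replication reports stored in the system.replication_stats and system.replication_constraint_stats tables; check your version's documentation before relying on them, since the reports are refreshed periodically and may lag behind the cluster state.

```sql
-- Per-zone counts of unavailable and under-replicated ranges
SELECT zone_id, total_ranges, unavailable_ranges, under_replicated_ranges
FROM system.replication_stats;

-- Constraints that some ranges currently fail to satisfy
SELECT zone_id, type, config, violating_ranges
FROM system.replication_constraint_stats
WHERE violating_ranges > 0;
```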

The next concept you’ll wrestle with is how to achieve geo-partitioning, where specific data is pinned to specific geographical regions for compliance or performance reasons.
