Cassandra doesn’t actually distribute data evenly across nodes; it distributes tokens, and the data follows.

Let’s see it in action. Imagine a simple cluster with two nodes, node1 and node2, and a keyspace with a replication factor of 1.

nodetool ring

This command shows you the tokens assigned to each node. You’ll see something like this:

Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                                      Rack
UN  10.0.0.1   100.00 KiB  100     100.00%           a1b2c3d4-e5f6-7890-1234-567890abcdef         rack1
UN  10.0.0.2   100.00 KiB  100     100.00%           f9e8d7c6-b5a4-3210-fedc-ba9876543210         rack1

Here, each node has 100 tokens. This means the entire token range, typically from Long.MIN_VALUE to Long.MAX_VALUE (or a subset depending on configuration), is divided into 100 contiguous segments. node1 is responsible for a set of these segments, and node2 for the remaining ones. When you write data, Cassandra hashes your row key to get a token. Whichever node "owns" that token range receives the data.

The key here is that tokens are what get distributed, not directly the data volume. If you have a hot row key that hashes to a token range owned by node1, node1 will receive a disproportionate amount of data, even if node2 has plenty of free space.

The Problem Solved: Scalable Writes and Reads

Cassandra’s token-based distribution is its core mechanism for achieving horizontal scalability. Instead of a single master managing all data, each node is responsible for a specific range of tokens. This allows you to add more nodes to the cluster and have them take ownership of new token ranges, thereby increasing the cluster’s capacity for both storage and throughput.

When you perform a read or write, Cassandra first hashes the row key to determine its token. It then consults its internal token map to identify which node(s) are responsible for that token range. This lookup is extremely fast. For writes, the coordinator node sends the data to the replica(s) responsible for the token. For reads, the coordinator requests the data from the replica(s) and can even perform read repair if inconsistencies are detected.

Internal Mechanics: Hashing and Ring Management

Every row in Cassandra is assigned a token. This token is generated by applying a consistent hashing algorithm (like Murmur3) to the row key. The output of this hash function is a 64-bit integer.

The cluster maintains a "ring" where these tokens are arranged in a circular fashion. Each node in the cluster is assigned a set of these tokens, effectively owning the contiguous range of tokens between its own assigned tokens and the next token in the ring (in clockwise order).

For example, if node1 has tokens T1 and T3, and node2 has T2 (where T1 < T2 < T3), then:

  • node1 owns tokens from T3 up to (but not including) T1 in the ring.
  • node2 owns tokens from T1 up to (but not including) T2.

The nodetool ring command visualizes this. The "Tokens" column often shows the number of tokens a node is responsible for, not the actual token values. In older Cassandra versions or specific configurations, you might see actual token values if you use nodetool gossipinfo or inspect node configuration files.

When you add a new node to the cluster, Cassandra redistributes some token ranges from existing nodes to the new node. This process is called "streaming." The new node joins the ring, and existing nodes identify which of their tokens now fall within the ranges owned by the new node. Data corresponding to these tokens is then streamed from the old nodes to the new one.

The Counterintuitive Truth About Hotspots

You might think that if you have many nodes, data will be distributed perfectly evenly. This is not necessarily true. While Cassandra distributes tokens evenly (especially with many tokens per node), the data follows the tokens. If your application generates row keys that, when hashed, all fall within a small contiguous range of tokens, those tokens will be owned by a small subset of nodes. This leads to "hotspots" where a few nodes handle a significantly larger portion of the read/write traffic and storage.

For instance, if your row keys are sequential integers like 1000000, 1000001, 1000002, and they all hash to tokens owned by node1, node1 will be overloaded while other nodes are idle. This is why choosing a good, randomizing row key or using a composite partition key with a random element is crucial for avoiding data skew and ensuring even distribution. The primary goal is to ensure the hash of your partition key is spread across the entire token range.

The next concept you’ll likely grapple with is how replication factor impacts read and write paths and fault tolerance.

Want structured learning?

Take the full Cassandra course →