Hinted handoff is Cassandra’s way of making sure a replica that is temporarily down still receives the writes it missed. The coordinator acts like a temporary custodian for data that can’t yet reach its final destination.
Let’s watch it in action. Imagine a simple users table:
CREATE TABLE users (
    user_id UUID PRIMARY KEY,
    username text,
    email text
);
We’re writing a new user with user_id a1b2c3d4-e5f6-7890-1234-567890abcdef. This user_id hashes to a specific token range. Let’s say Cassandra has three replicas for this token range: nodes node1, node2, and node3.
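To make "hashes to a specific token range" concrete, here is a minimal sketch of how a key could be mapped to replicas on a token ring. It is an illustration under stated assumptions, not Cassandra’s implementation: the ring layout, the fourth node (node4), and the helper names token_for and replicas_for are hypothetical, and MD5 stands in for the Murmur3 partitioner that Cassandra uses by default.

```python
import bisect
import hashlib
import uuid

# Hypothetical ring: each node owns the range ending at its token.
# Real clusters use Murmur3 tokens and usually many virtual nodes per host.
RING = [
    (0x40000000, "node1"),
    (0x80000000, "node2"),
    (0xC0000000, "node3"),
    (0xFFFFFFFF, "node4"),
]


def token_for(key: uuid.UUID) -> int:
    """Stand-in hash: first 4 bytes of MD5 as an unsigned 32-bit token."""
    return int.from_bytes(hashlib.md5(key.bytes).digest()[:4], "big")


def replicas_for(key: uuid.UUID, rf: int = 3) -> list:
    """Walk clockwise from the key's token and take the next `rf` nodes."""
    tokens = [t for t, _ in RING]
    start = bisect.bisect_left(tokens, token_for(key)) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(rf)]


if __name__ == "__main__":
    key = uuid.UUID("a1b2c3d4-e5f6-7890-1234-567890abcdef")
    print(token_for(key), replicas_for(key))
```

The point is only that the replica set is a deterministic function of the key: any coordinator can compute which nodes should hold a row, and therefore which ones missed a write.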
Normally, when you run INSERT INTO users (user_id, username, email) VALUES (a1b2c3d4-e5f6-7890-1234-567890abcdef, 'alice', 'alice@example.com');, the coordinator node (node1 in this case, which happens to be a replica too) applies the write locally and forwards it to node2 and node3.
Now, what if node3 is down for maintenance? The coordinator (node1) tries to send the write to node2 and node3. node2 acknowledges the write. node3 doesn’t respond.
Whether the write succeeds depends on the consistency level you asked for, not on the hint. With three replicas and QUORUM, two acknowledgments (node1 and node2) are enough, so you don’t get a "write failed, one replica down" error. On its own, though, that would leave node3 missing the row after it returns, and this is where hinted handoff kicks in. The coordinator (node1) knows node3 is unavailable, so it stores a "hint" for node3 locally. The hint is essentially a record saying, "Hey, when you (node3) come back, here is a write you missed," and it carries the mutation itself along with the ID of the node it is meant for.
node1 then reports success to the client because enough live replicas acknowledged the write to satisfy the consistency level. The hint does not count toward that requirement. The one exception is consistency level ANY, where a stored hint alone is enough for the write to be reported as successful.
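The coordinator’s decision can be sketched in a few lines. This is a simplified model, assuming success is judged by acknowledgments from live replicas against a required count (2 for QUORUM with three replicas) and that hints are recorded on the side. The Replica and Coordinator classes are hypothetical names for illustration, and consistency level ANY is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    up: bool = True
    rows: dict = field(default_factory=dict)


@dataclass
class Coordinator:
    replicas: list
    # Each hint is (target replica name, key, value), kept locally.
    hints: list = field(default_factory=list)

    def write(self, key, value, required_acks: int) -> bool:
        live = [r for r in self.replicas if r.up]
        # Too few live replicas: reject up front, write nothing.
        if len(live) < required_acks:
            return False
        for r in self.replicas:
            if r.up:
                r.rows[key] = value          # normal replica write
            else:
                self.hints.append((r.name, key, value))  # remember for later
        return True


if __name__ == "__main__":
    node1, node2 = Replica("node1"), Replica("node2")
    node3 = Replica("node3", up=False)
    coord = Coordinator([node1, node2, node3])
    ok = coord.write("alice", "alice@example.com", required_acks=2)
    print(ok, coord.hints, node3.rows)
```

Asking for all three acknowledgments (`required_acks=3`) makes the same write fail in this model, even though a hint could have been stored. That is the sense in which hints do not count toward the consistency level.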
Once node3 comes back online, Cassandra’s gossip protocol informs the other nodes. node1 (and any other coordinator that stored hints while node3 was away) then checks its stored hints, finds the ones addressed to node3, and replays the missed writes to it. This is called "hinted handoff delivery." node3 applies each write with its original timestamp, and the replicas agree on that data again.
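Delivery can be sketched as a loop over stored hints. This is a minimal model, assuming each hint carries the original write timestamp and that the receiving node resolves conflicts by last-write-wins, so a replayed hint never overwrites newer data. The function names and the dict-based node representation are hypothetical.

```python
def apply_write(rows: dict, key, value, ts: int) -> None:
    """Last-write-wins: keep the value with the highest timestamp."""
    current = rows.get(key)
    if current is None or ts > current[1]:
        rows[key] = (value, ts)


def replay_hints(hints: list, nodes: dict) -> list:
    """Deliver hints whose target is up; return the ones still pending."""
    pending = []
    for target, key, value, ts in hints:
        node = nodes[target]
        if node["up"]:
            apply_write(node["rows"], key, value, ts)
        else:
            pending.append((target, key, value, ts))
    return pending


if __name__ == "__main__":
    nodes = {"node3": {"up": False, "rows": {}}}
    hints = [("node3", "alice", "alice@example.com", 100)]
    hints = replay_hints(hints, nodes)   # node3 still down: nothing delivered
    nodes["node3"]["up"] = True          # gossip reports node3 is back
    hints = replay_hints(hints, nodes)   # hint delivered and dropped
    print(hints, nodes["node3"]["rows"])
```

Carrying the timestamp matters: if node3 received a newer value for the same key before the hint arrived, replaying the older hint leaves the newer value in place.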
The "hint" itself is stored on the coordinator, the node that received the client’s write, and that node is responsible for delivering it to the unavailable replica. A hint records which node it is for and the write that node missed. Hints are not generated indefinitely: by default a coordinator only writes hints for a node during the first 3 hours it is down. That window is controlled by max_hint_window_in_ms, not by hinted_handoff_throttle_delay_in_ms, which in older releases only paces hint delivery and has nothing to do with expiry. If a node is down for longer than the window, later writes are not hinted at all, and Cassandra relies on read repair and anti-entropy repair (nodetool repair) to fix the resulting inconsistencies.
The key configuration for hinted handoff is hinted_handoff_enabled: true in cassandra.yaml, which is the default. The other setting to know is max_hint_window_in_ms, which defaults to 10800000 ms (3 hours). It is the longest a node can be down while coordinators keep generating hints for it; once a node has been down longer than that, new writes destined for it are no longer hinted.
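The interaction between the two settings can be written as a small predicate. This is a sketch of the rule as described here, not Cassandra’s code: the function name should_store_hint is hypothetical, and treating the exact boundary as inclusive is an assumption.

```python
# Default from cassandra.yaml: max_hint_window_in_ms = 10800000 (3 hours).
MAX_HINT_WINDOW_MS = 10_800_000


def should_store_hint(
    down_since_ms: int,
    now_ms: int,
    enabled: bool = True,               # hinted_handoff_enabled
    window_ms: int = MAX_HINT_WINDOW_MS,
) -> bool:
    """Store a hint only if handoff is on and the node is within the window."""
    if not enabled:
        return False
    # Inclusive boundary is an assumption made for this illustration.
    return now_ms - down_since_ms <= window_ms


if __name__ == "__main__":
    hour = 3_600_000
    print(should_store_hint(0, 1 * hour))                  # within the window
    print(should_store_hint(0, 4 * hour))                  # window exceeded
    print(should_store_hint(0, 1 * hour, enabled=False))   # handoff disabled
```

Writes that arrive after the window has closed are exactly the ones a later repair has to reconcile.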
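The surprising thing about hinted handoff is how ordinary the hint is. Before Cassandra 3.0 it was a regular write to an internal table, system.hints (not system_traces, which holds query traces); since 3.0, hints are appended to flat files in the node’s hints directory. Either way, hints live only on the coordinator that wrote them and are not replicated. If that coordinator is lost before delivery, its hints are lost with it, which is why hinted handoff is a best-effort optimization and not a consistency guarantee.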
The next concept you’ll run into is how Cassandra handles nodes that have been unavailable for extended periods, where hinted handoff is not enough and read repair and anti-entropy repair become critical.