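Causal consistency guarantees that if one operation "happens before" another, then all processes see them in that same order. Operations with no causal relationship (concurrent operations) may be seen in different orders by different processes.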

Let’s see this in action with a simple distributed chat application. Imagine two users, Alice and Bob, and a chat server.

Alice sends "Hello Bob!" to the server. Bob reads it and replies "Hi Alice!".

In a system with causal consistency:

  1. Alice sends "Hello Bob!" The server receives it and tags it with causal metadata (for example, a timestamp such as a vector clock).
  2. The server delivers "Hello Bob!" to Bob.
  3. Bob, having read Alice’s message, sends "Hi Alice!" The server records that this reply causally depends on "Hello Bob!".
  4. The server broadcasts "Hi Alice!" to Alice and to any other participant, but only after "Hello Bob!" has been delivered to them. Nobody sees the reply before the message it answers.

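The key is that if Bob reads Alice’s message and then sends his own, the causal link is preserved: Bob’s send causally depends on his having received Alice’s message. If Bob had instead sent "Hi Alice!" without having seen Alice’s message, the two messages would be concurrent, and different participants could see them in different orders.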
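The rule that a reply is never shown before the message it answers can be sketched as below. This is a minimal illustration, not a real chat protocol: the `Message` and `ChatClient` classes, the `depends_on` field, and the third participant `carol` are all hypothetical names introduced for the example. It tracks a single explicit dependency per message rather than a full vector clock.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    msg_id: str
    text: str
    depends_on: Optional[str] = None  # id of the message this one replies to, if any


@dataclass
class ChatClient:
    """Hypothetical client that displays a message only after its dependency."""
    displayed: list = field(default_factory=list)  # texts, in display order
    seen_ids: set = field(default_factory=set)     # ids already displayed
    pending: list = field(default_factory=list)    # messages waiting on a dependency

    def receive(self, msg: Message) -> None:
        self.pending.append(msg)
        self._flush()

    def _flush(self) -> None:
        # Keep displaying until no pending message has its dependency satisfied.
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                if msg.depends_on is None or msg.depends_on in self.seen_ids:
                    self.pending.remove(msg)
                    self.displayed.append(msg.text)
                    self.seen_ids.add(msg.msg_id)
                    progress = True


hello = Message("m1", "Hello Bob!")
reply = Message("m2", "Hi Alice!", depends_on="m1")

carol = ChatClient()
carol.receive(reply)   # arrives first over the network, so it is held back
carol.receive(hello)   # dependency arrives; both are displayed in causal order
print(carol.displayed)  # ['Hello Bob!', 'Hi Alice!']
```

Holding the reply in a buffer trades a little latency for the guarantee that no participant sees an effect without its cause.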

This is different from strong consistency (where everyone sees all operations, including concurrent ones, in the same total order) and eventual consistency (where there’s no guarantee on order, and replicas might diverge for a while before converging).

How it works internally:

Causal consistency is often implemented using techniques like vector clocks or version vectors. A vector clock is a list of counters, one for each process in the system. Before a process sends a message, it increments its own counter and attaches its current vector clock to the message. When a process receives a message, it merges the received clock into its own by taking the maximum of each pair of counters.

For example, take three processes P1, P2, and P3, each starting with the clock V = [0, 0, 0]:

  1. P1 sends a message. It increments its own counter, so its clock becomes [1, 0, 0], and the message carries [1, 0, 0].
  2. P2 receives this message. It merges the clocks by taking the element-wise maximum, V[i] = max(V[i], received_V[i]) for every i, so its clock becomes [1, 0, 0].
  3. P2 then sends a message. It increments its own counter, so its clock becomes [1, 1, 0], and the message carries [1, 1, 0].
  4. P1 receives this message and merges: max([1, 0, 0], [1, 1, 0]) = [1, 1, 0].

In this walkthrough only sends increment a counter; many vector clock formulations also increment the receiver’s own entry on each receive.
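The same exchange between P1 and P2 can be reproduced in a few lines of Python. This is a minimal sketch: the `VectorClock` class and its `send` and `receive` methods are hypothetical names, processes are indexed from 0 (so P1 is index 0), and it assumes the variant used above in which only sends increment a counter.

```python
class VectorClock:
    """Vector clock for a fixed set of processes, indexed from 0.

    Assumption: only sends increment a counter, matching the walkthrough above.
    Many formulations also increment the receiver's own entry on receive.
    """

    def __init__(self, num_processes: int, owner: int) -> None:
        self.owner = owner                   # index of the process owning this clock
        self.counters = [0] * num_processes

    def send(self) -> list[int]:
        # Increment our own entry, then attach a copy of the clock to the message.
        self.counters[self.owner] += 1
        return list(self.counters)

    def receive(self, received: list[int]) -> None:
        # Merge: element-wise maximum over every entry, not just our own.
        self.counters = [max(a, b) for a, b in zip(self.counters, received)]


p1 = VectorClock(3, owner=0)   # P1
p2 = VectorClock(3, owner=1)   # P2

m1 = p1.send()      # P1's clock and the message are both [1, 0, 0]
p2.receive(m1)      # P2's clock becomes [1, 0, 0]
m2 = p2.send()      # P2's clock and the message are both [1, 1, 0]
p1.receive(m2)      # P1's clock becomes [1, 1, 0]
print(p1.counters, p2.counters)  # [1, 1, 0] [1, 1, 0]
```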

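A message m1 happened before m2 if V(m1) is causally older than V(m2). This means V(m1)[i] <= V(m2)[i] for all i, and V(m1)[j] < V(m2)[j] for at least one j. If neither clock is causally older than the other, the two messages are concurrent, and causal consistency imposes no order between them.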
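The comparison rule translates almost directly into code. The function names `happened_before` and `concurrent` below are illustrative choices, and the sketch assumes both clocks have the same length.

```python
def happened_before(a: list[int], b: list[int]) -> bool:
    """True if clock a is causally older than clock b."""
    # Every entry of a is <= the matching entry of b, and at least one is strictly less.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def concurrent(a: list[int], b: list[int]) -> bool:
    """True if the clocks differ and neither is causally older than the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


print(happened_before([1, 0, 0], [1, 1, 0]))  # True
print(happened_before([1, 1, 0], [1, 0, 0]))  # False
print(concurrent([1, 0, 0], [0, 1, 0]))       # True
```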

The problem it solves:

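Causal consistency prevents anomalies in which an effect becomes visible without its cause, such as seeing a reply before the message it answers. It also typically implies session guarantees such as "read-your-own-writes", which systems that are only eventually consistent do not provide by default. It does not by itself prevent "lost updates": two concurrent writes to the same item still need a conflict resolution strategy. It’s about preserving the logical flow of events without the overhead of global ordering.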

Consider a system where you can "like" a post. Alice likes post X. Bob sees Alice’s like and then likes post X himself.

With causal consistency:

  • Alice’s like is recorded.
  • Bob’s like is recorded.
  • The system ensures that if Bob saw Alice’s like before he sent his own, then any observer who sees Bob’s like must also see Alice’s like. This prevents a scenario where an observer sees Bob’s like but misses Alice’s, which would be a logical inconsistency.
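One way to enforce this on the observer’s side is the delivery condition used in causal broadcast: a write becomes visible only when it is the next write from its sender and everything in its causal past is already visible. The sketch below assumes each write carries a vector timestamp with one entry per writer; the `Replica` class and the `ALICE` and `BOB` indices are hypothetical names for illustration.

```python
class Replica:
    """Hypothetical replica that shows a write only once its causal past is visible."""

    def __init__(self, num_writers: int) -> None:
        self.delivered = [0] * num_writers  # writes made visible so far, per writer
        self.visible = []                   # labels of visible writes, in order
        self.buffer = []                    # (sender, timestamp, label) held back

    def _deliverable(self, sender: int, ts: list[int]) -> bool:
        # Next write from this sender, and nothing in its causal past is missing here.
        return ts[sender] == self.delivered[sender] + 1 and all(
            ts[k] <= self.delivered[k] for k in range(len(ts)) if k != sender
        )

    def receive(self, sender: int, ts: list[int], label: str) -> None:
        self.buffer.append((sender, ts, label))
        progress = True
        while progress:  # one delivery may unblock others, so rescan the buffer
            progress = False
            for item in list(self.buffer):
                s, t, name = item
                if self._deliverable(s, t):
                    self.buffer.remove(item)
                    self.delivered[s] += 1
                    self.visible.append(name)
                    progress = True


ALICE, BOB = 0, 1
observer = Replica(num_writers=2)

# Bob liked after seeing Alice's like, so his timestamp includes hers.
observer.receive(BOB, [1, 1], "Bob likes X")      # held back: Alice's like is missing
print(observer.visible)                           # []
observer.receive(ALICE, [1, 0], "Alice likes X")  # unblocks Bob's like
print(observer.visible)                           # ['Alice likes X', 'Bob likes X']
```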

The levers you control:

When designing or using a system that offers causal consistency, you’re primarily concerned with how operations are tagged and propagated. You don’t typically "configure" causal consistency itself, but rather the underlying messaging or data structures (like vector clocks) that enable it. The "control" comes from understanding how your application logic interacts with these causal dependencies. For instance, if your application relies on a specific order of events that isn’t causally linked (e.g., two independent writes that you want to see in a specific global order), causal consistency alone won’t guarantee that.

A common pitfall is thinking causal consistency means you’ll always see the latest version of data. It only guarantees that if a state change led to another state change, the dependency is preserved. You might still read an older version of data if no direct causal link exists between your read and a more recent write.

The next concept you’ll grapple with is how to handle conflicting writes that are causally independent but still need a resolution strategy.
