The most surprising truth about data replication is that "consistency" is a spectrum, not a binary state, and the trade-offs you make for speed can fundamentally alter what "data" even means.
Let’s see this in action. Imagine two database nodes, db-primary and db-replica, in a distributed system. We’re writing a simple record:
{
  "id": 123,
  "name": "Acme Corp",
  "status": "active"
}
Synchronous Replication (Sync)
In sync replication, when a client sends a write request to db-primary, db-primary must confirm that the write has been successfully applied to db-replica before acknowledging the write to the client.
Here’s a simplified flow:
- Client sends INSERT {id: 123, ...} to db-primary.
- db-primary writes the data to its local disk.
- db-primary sends the data to db-replica.
- db-replica writes the data to its local disk.
- db-replica sends an acknowledgment back to db-primary.
- db-primary sends an acknowledgment back to the client.
Why it matters: If db-primary crashes after it has acknowledged the write, but before db-replica has received or applied it, the data is lost. Sync replication prevents this scenario. The client only gets a "success" when the data is durably stored on both nodes.
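The synchronous flow above can be sketched as a toy in-memory model. This is an illustration only, not how any real database implements replication: the Node class, its dict-based "disk", and the sync_write helper are all hypothetical names made up for this example.

```python
class Node:
    """A toy node whose 'disk' is just a dict (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.disk = {}

    def write(self, record):
        # Store the record and return an acknowledgment.
        self.disk[record["id"]] = record
        return True


def sync_write(primary, replica, record):
    """Acknowledge to the client only after both nodes have stored the record."""
    primary.write(record)
    replica_ack = replica.write(record)  # the primary blocks on this call
    if not replica_ack:
        raise RuntimeError("replica did not confirm; write is not acknowledged")
    return "ok"


primary = Node("db-primary")
replica = Node("db-replica")
record = {"id": 123, "name": "Acme Corp", "status": "active"}

ack = sync_write(primary, replica, record)

# Simulate losing the primary right after the client received its ack.
primary.disk.clear()
survived = replica.disk.get(123)
```

Because the acknowledgment is returned only after the replica's write, the record is still available on db-replica when the primary's copy disappears.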
Configuration Example (Conceptual - PostgreSQL Streaming Replication):
On db-primary (postgresql.conf):
wal_level = replica
max_wal_senders = 5
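synchronous_commit = on # Waits for the standby to flush the WAL to disk; use remote_apply to also wait for replay
synchronous_standby_names = 'db-replica_name' # Must match the application_name the standby sets in primary_conninfo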
On db-replica:
hot_standby = on
Asynchronous Replication (Async)
In async replication, db-primary acknowledges the write to the client as soon as it has written the data locally. It then sends the data to db-replica in the background.
The flow looks like this:
- Client sends INSERT {id: 123, ...} to db-primary.
- db-primary writes the data to its local disk.
- db-primary sends an acknowledgment back to the client.
- db-primary sends the data to db-replica (eventually).
- db-replica writes the data to its local disk (eventually).
Why it matters: This is much faster for the client because db-primary doesn’t wait for db-replica. However, if db-primary crashes after acknowledging the write but before the data has reached db-replica, that data is lost. You’ve accepted the risk of data loss for higher write throughput and lower latency.
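A minimal sketch of that data-loss window, again as a toy in-memory model rather than a real implementation. The AsyncPrimary class, its outbox queue, and the extra record with id 122 are hypothetical and exist only to make the example concrete.

```python
from collections import deque


class AsyncPrimary:
    """Toy primary that acknowledges before shipping (hypothetical model)."""

    def __init__(self):
        self.disk = {}
        self.outbox = deque()  # records written locally but not yet shipped

    def write(self, record):
        self.disk[record["id"]] = record
        self.outbox.append(record)
        return "ok"  # acknowledged immediately, replica not consulted

    def ship_pending(self, replica_disk):
        # Background shipping: drain the outbox into the replica's storage.
        while self.outbox:
            rec = self.outbox.popleft()
            replica_disk[rec["id"]] = rec


primary = AsyncPrimary()
replica_disk = {}

# First write is acknowledged and later shipped in the background.
primary.write({"id": 122, "name": "Earlier Corp", "status": "active"})
primary.ship_pending(replica_disk)

# Second write is acknowledged, but the primary "crashes" before shipping it.
ack = primary.write({"id": 123, "name": "Acme Corp", "status": "active"})
acknowledged_ids = {122, 123}

# After failover only the replica's contents remain.
lost_ids = acknowledged_ids - set(replica_disk)
```

Here lost_ids contains 123: the client was told "ok", yet the record never left the failed primary.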
Configuration Example (Conceptual - PostgreSQL Streaming Replication):
On db-primary (postgresql.conf):
wal_level = replica
max_wal_senders = 5
synchronous_commit = local # Wait for the local WAL flush only; 'off' would acknowledge before even that flush completes
# synchronous_standby_names is NOT set, or set to ''
On db-replica:
hot_standby = on
Semi-Synchronous Replication (Semi-Sync)
This is a hybrid approach. db-primary writes the data locally and sends it to db-replica. It then waits for db-replica to receive the data (not necessarily apply it, just acknowledge receipt) before acknowledging the write to the client.
The flow:
- Client sends INSERT {id: 123, ...} to db-primary.
- db-primary writes the data to its local disk.
- db-primary sends the data to db-replica.
- db-replica receives the data and sends an acknowledgment of receipt back to db-primary.
- db-primary sends an acknowledgment back to the client.
- db-replica applies the data to its own storage.
Why it matters: Semi-sync offers a compromise. It provides better durability guarantees than async (the data has at least reached the replica before the client is told "success") but is typically faster than full sync because db-primary doesn’t have to wait for db-replica to apply the data, only to receive it. The exact guarantee depends on the implementation; some wait only for receipt in memory, while others wait for the replica to flush to disk.
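The gap between "received" and "applied" can be sketched with a toy replica that keeps the two states separate. The SemiSyncReplica class and the semi_sync_write helper are hypothetical names for this illustration and do not correspond to any real database API.

```python
class SemiSyncReplica:
    """Toy replica that separates receipt from apply (hypothetical model)."""

    def __init__(self):
        self.received = []  # changes received but not yet applied
        self.tables = {}    # what queries against the replica can see

    def receive(self, record):
        self.received.append(record)
        return True  # acknowledgment of receipt only

    def apply_pending(self):
        # Runs later, after the client has already been acknowledged.
        for rec in self.received:
            self.tables[rec["id"]] = rec
        self.received.clear()


def semi_sync_write(primary_disk, replica, record):
    """Acknowledge once the replica confirms receipt, not apply."""
    primary_disk[record["id"]] = record
    if not replica.receive(record):
        raise RuntimeError("replica did not confirm receipt")
    return "ok"


primary_disk = {}
replica = SemiSyncReplica()
record = {"id": 123, "name": "Acme Corp", "status": "active"}

ack = semi_sync_write(primary_disk, replica, record)
visible_before_apply = 123 in replica.tables

replica.apply_pending()
visible_after_apply = 123 in replica.tables
```

At the moment of the ack the record is held by the replica but not yet queryable there; it becomes visible only once apply_pending has run.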
Configuration Example (Conceptual - PostgreSQL Streaming Replication with remote_write):
On db-primary (postgresql.conf):
wal_level = replica
max_wal_senders = 5
synchronous_commit = remote_write # Wait until the standby has received the WAL and written it to its OS, not until it is flushed or replayed
synchronous_standby_names = 'db-replica_name' # Must match the application_name the standby sets in primary_conninfo
On db-replica:
hot_standby = on
# Set application_name in primary_conninfo so the primary can match this standby against synchronous_standby_names
The Mental Model: Durability vs. Latency
The core trade-off is always between how quickly a client gets a confirmation (latency) and how likely the data is to survive a failure (durability).
- Sync: High durability, high latency.
- Async: Low durability, low latency.
- Semi-Sync: Medium durability, medium latency.
The "system" isn’t just the databases; it’s also the network. Every sync or semi-sync commit includes at least one network round trip to the replica, so a slow link shows up directly as client-visible write latency, whereas async hides the same delay as replica lag. The "data" itself becomes fluid: in async, the primary’s view is the "truth" for a brief moment, while the replica is lagging.
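To make the trade-off concrete, here is a back-of-the-envelope latency model. It assumes the steps happen strictly one after another, as in the flows described earlier, and the millisecond figures are invented for illustration, not measurements. The function name and its parameters are hypothetical.

```python
def commit_latency_ms(mode, local_flush, round_trip, replica_flush, replica_apply):
    """Client-visible commit latency under a simple sequential model."""
    if mode == "async":
        # Only the primary's own disk write is on the client's critical path.
        return local_flush
    if mode == "semi-sync":
        # Add one network round trip for the replica's receipt acknowledgment.
        return local_flush + round_trip
    if mode == "sync":
        # Also wait for the replica to persist and apply the change.
        return local_flush + round_trip + replica_flush + replica_apply
    raise ValueError(f"unknown mode: {mode}")


# Illustrative timings in milliseconds (assumed values).
timings = dict(local_flush=1.0, round_trip=2.0, replica_flush=1.0, replica_apply=0.5)

latencies = {
    mode: commit_latency_ms(mode, **timings)
    for mode in ("async", "semi-sync", "sync")
}

# The same model with a slow, 50 ms link: only the modes that wait on the
# replica get slower.
slow_link = dict(timings, round_trip=50.0)
slow_latencies = {
    mode: commit_latency_ms(mode, **slow_link)
    for mode in ("async", "semi-sync", "sync")
}
```

With these assumed numbers the model gives 1.0 ms for async, 3.0 ms for semi-sync, and 4.5 ms for sync. On the slow link async stays at 1.0 ms while the other two are dominated by the round trip.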
The critical concept most people miss is that even "synchronous" replication in many systems doesn’t guarantee the data is applied to the replica’s data structures, only that it’s been received and possibly flushed to disk. Depending on the configured level, the primary waits for confirmation that the replica has received the change, written it to its log, or flushed that log to disk. The actual INSERT or UPDATE operation on the replica’s tables might still happen asynchronously after that confirmation. This nuance is crucial for understanding stale reads on the replica and, when the wait stops short of a disk flush, potential data loss even in seemingly robust setups.
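As one concrete case, PostgreSQL exposes these wait points through the synchronous_commit setting. The lookup below summarizes what the primary waits for at each level, ordered from weakest to strongest. The remote levels only take effect when synchronous_standby_names names a standby. The helper function names are hypothetical, and the one-line descriptions are a condensed paraphrase, not the official wording.

```python
# PostgreSQL synchronous_commit levels, weakest to strongest, with a short
# paraphrase of what the primary waits for before acknowledging a commit.
WAIT_POINTS = {
    "off": "nothing (ack may precede the local WAL flush)",
    "local": "local WAL flush only",
    "remote_write": "standby received the WAL and wrote it to its OS",
    "on": "standby flushed the WAL to disk",
    "remote_apply": "standby replayed the WAL (visible to queries)",
}
LEVELS = list(WAIT_POINTS)  # dicts preserve insertion order


def at_least(level, required):
    """True if `level` waits for at least as much as `required`."""
    return LEVELS.index(level) >= LEVELS.index(required)


def survives_standby_os_crash(level):
    # The change must be flushed to the standby's disk before the ack.
    return at_least(level, "on")


def readable_on_standby_at_ack(level):
    # The change must already be replayed on the standby before the ack.
    return at_least(level, "remote_apply")


summary = {
    level: (survives_standby_os_crash(level), readable_on_standby_at_ack(level))
    for level in LEVELS
}
```

Note that the default "on" level gives durability on the standby but not read-your-writes there; only remote_apply gives both.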
The next problem you’ll run into is dealing with network partitions and how different replication modes behave when nodes can’t communicate.