The surprising truth about event stores is that they aren’t just databases of events; they are the single source of truth that defines the state of your system.

Let’s see it in action. Imagine we’re building a simple order processing system. Our events might look like this:

{
  "eventType": "OrderCreated",
  "orderId": "a1b2c3d4",
  "customerId": "user123",
  "timestamp": "2023-10-27T10:00:00Z"
}
{
  "eventType": "OrderItemAdded",
  "orderId": "a1b2c3d4",
  "itemId": "widget-x",
  "quantity": 2,
  "price": 15.99,
  "timestamp": "2023-10-27T10:01:30Z"
}
{
  "eventType": "OrderShipped",
  "orderId": "a1b2c3d4",
  "shippingAddress": "123 Main St, Anytown, USA",
  "timestamp": "2023-10-27T11:30:00Z"
}

When an OrderCreated event is appended to the event store for orderId: a1b2c3d4, an "order" aggregate is born. When OrderItemAdded follows, the aggregate’s internal state updates to reflect the new item. Finally, OrderShipped transitions the order to a shipped state. The current state of orderId: a1b2c3d4 isn’t stored anywhere directly; it’s derived by replaying all its events from the store. This is the core of event sourcing.
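To make the replay step concrete, here is a minimal sketch in Python. The `OrderState` class and the `apply` and `replay` functions are hypothetical names chosen for illustration; only the event shapes come from the example above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OrderState:
    """In-memory state of one order, rebuilt from its events (hypothetical shape)."""
    order_id: Optional[str] = None
    customer_id: Optional[str] = None
    items: dict = field(default_factory=dict)  # itemId -> total quantity
    status: str = "none"


def apply(state, event):
    """Fold a single event into the state."""
    kind = event["eventType"]
    if kind == "OrderCreated":
        state.order_id = event["orderId"]
        state.customer_id = event["customerId"]
        state.status = "created"
    elif kind == "OrderItemAdded":
        item_id = event["itemId"]
        state.items[item_id] = state.items.get(item_id, 0) + event["quantity"]
    elif kind == "OrderShipped":
        state.status = "shipped"
    return state


def replay(events):
    """Derive the current state by applying every event in order."""
    state = OrderState()
    for event in events:
        apply(state, event)
    return state


events = [
    {"eventType": "OrderCreated", "orderId": "a1b2c3d4", "customerId": "user123",
     "timestamp": "2023-10-27T10:00:00Z"},
    {"eventType": "OrderItemAdded", "orderId": "a1b2c3d4", "itemId": "widget-x",
     "quantity": 2, "price": 15.99, "timestamp": "2023-10-27T10:01:30Z"},
    {"eventType": "OrderShipped", "orderId": "a1b2c3d4",
     "shippingAddress": "123 Main St, Anytown, USA",
     "timestamp": "2023-10-27T11:30:00Z"},
]

state = replay(events)
```

Nothing here writes the state back anywhere: `state` is a throwaway value that can be recomputed from the events at any time.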

For CQRS (Command Query Responsibility Segregation), the event store is the persistence layer of the write side. Commands come in (e.g., "Add Item to Order"), are validated against the current state (derived from the event store), and if valid, result in new events being appended. The read side (queries) subscribes to these events and projects them into optimized read models (e.g., a SQL table of current orders, a search index of shipped items). This separation allows reads and writes to be scaled and optimized independently.
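The flow above might look roughly like this in Python. This is a sketch under assumptions: `handle_add_item`, `order_status`, `project`, and `InvalidCommand` are hypothetical names, the validation rules are invented for the example, and a plain list and dict stand in for the event store and the read model.

```python
class InvalidCommand(Exception):
    """Raised when write-side validation rejects a command."""


def order_status(stream):
    """Derive just enough state from the stream to validate commands."""
    status = None
    for event in stream:
        if event["eventType"] == "OrderCreated":
            status = "created"
        elif event["eventType"] == "OrderShipped":
            status = "shipped"
    return status


def handle_add_item(stream, order_id, item_id, quantity, price):
    """Write side: validate against derived state, then return new events."""
    status = order_status(stream)
    if status is None:
        raise InvalidCommand("order does not exist")
    if status == "shipped":
        raise InvalidCommand("cannot add items to a shipped order")
    if quantity <= 0:
        raise InvalidCommand("quantity must be positive")
    return [{"eventType": "OrderItemAdded", "orderId": order_id,
             "itemId": item_id, "quantity": quantity, "price": price}]


def project(read_model, event):
    """Read side: keep one denormalized row per order."""
    order_id = event["orderId"]
    if event["eventType"] == "OrderCreated":
        read_model[order_id] = {"customerId": event["customerId"],
                                "status": "created", "itemCount": 0}
    elif event["eventType"] == "OrderItemAdded":
        read_model[order_id]["itemCount"] += event["quantity"]
    elif event["eventType"] == "OrderShipped":
        read_model[order_id]["status"] = "shipped"


# The stream starts with the order's creation event.
stream = [{"eventType": "OrderCreated", "orderId": "a1b2c3d4",
           "customerId": "user123"}]

# A command arrives, is validated, and yields new events to append.
stream += handle_add_item(stream, "a1b2c3d4", "widget-x", 2, 15.99)

# The read side consumes the same events to build its own view.
current_orders = {}
for event in stream:
    project(current_orders, event)
```

Note that the command handler never touches `current_orders`, and the projection never validates anything. Each side depends only on the events.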

The key components you control are:

  1. Event Store Technology: Options range from specialized event stores like EventStoreDB or Axon Server to a general-purpose relational database (e.g., PostgreSQL) in which each event is a row and appends are made atomic by the schema and transactions. For a simple start, PostgreSQL with a table like events (aggregate_type VARCHAR(255), aggregate_id VARCHAR(255), event_type VARCHAR(255), event_data JSONB, sequence_number BIGINT, timestamp TIMESTAMPTZ DEFAULT NOW(), PRIMARY KEY (aggregate_type, aggregate_id, sequence_number)) is viable. The primary key rejects a second event with the same sequence number in the same stream, but you still have to implement optimistic concurrency carefully on top of it.
  2. Event Serialization: How you store your event data. JSON is common and human-readable. Protobuf or Avro offer schema evolution and performance benefits. The choice impacts how easily you can evolve your events over time.
  3. Event Stream Identification: How you group events. Typically, this is by aggregate type and aggregate ID (e.g., all events for Order with orderId: a1b2c3d4 form a single stream).
  4. Concurrency Control: Crucial for ensuring that multiple commands trying to modify the same aggregate don’t lead to data corruption. This is usually handled via optimistic concurrency: when appending new events, you check that the stream’s latest sequence number still matches the one you saw when you loaded the aggregate. If it doesn’t, another writer got there first, the append fails, and the command handler needs to re-fetch the current state and retry.
  5. Projections: The process of transforming the event stream into queryable read models. This is where you build your denormalized views for efficient querying. You’ll need mechanisms to ensure projections are kept up-to-date with the event store, often through message queues or direct subscriptions.
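Stream identification and concurrency control from the list above can be sketched together. The in-memory store below is an assumption made for illustration, not a production design: `InMemoryEventStore`, `ConcurrencyError`, and `append_with_retry` are hypothetical names, the stream key is the (aggregate type, aggregate ID) pair, and the version is simply the number of events in the stream. It is not thread-safe.

```python
from collections import defaultdict


class ConcurrencyError(Exception):
    """Raised when the stream changed since the caller last read it."""


class InMemoryEventStore:
    def __init__(self):
        # (aggregate_type, aggregate_id) -> ordered list of events
        self._streams = defaultdict(list)

    def load(self, aggregate_type, aggregate_id):
        """Return a copy of the stream and its current version."""
        events = list(self._streams[(aggregate_type, aggregate_id)])
        return events, len(events)

    def append(self, aggregate_type, aggregate_id, new_events, expected_version):
        """Append only if nobody else has written since expected_version."""
        stream = self._streams[(aggregate_type, aggregate_id)]
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, found {len(stream)}")
        stream.extend(new_events)
        return len(stream)


def append_with_retry(store, aggregate_type, aggregate_id, decide, max_attempts=3):
    """Re-fetch state and re-run the decision whenever the append conflicts."""
    for _ in range(max_attempts):
        events, version = store.load(aggregate_type, aggregate_id)
        new_events = decide(events)
        try:
            return store.append(aggregate_type, aggregate_id, new_events, version)
        except ConcurrencyError:
            continue
    raise ConcurrencyError("gave up after repeated conflicts")


store = InMemoryEventStore()
key = ("Order", "a1b2c3d4")
store.append(*key, [{"eventType": "OrderCreated"}], expected_version=0)

# Two writers load the same version of the stream...
_, version_a = store.load(*key)
_, version_b = store.load(*key)

# ...the first append wins, the second is rejected as stale.
store.append(*key, [{"eventType": "OrderItemAdded"}], expected_version=version_a)
try:
    store.append(*key, [{"eventType": "OrderShipped"}], expected_version=version_b)
    conflict = False
except ConcurrencyError:
    conflict = True

# The losing writer reloads and retries.
final_version = append_with_retry(
    store, *key, decide=lambda events: [{"eventType": "OrderShipped"}])
```

In a real handler, `decide` would re-validate the command against the reloaded events, since the competing write may have made it invalid.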

The ability to "go back in time" by replaying events is not just a debugging tool; it’s how you can build entirely new read models for data you never anticipated needing, without needing to alter your write-side logic or touch your historical data.
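As an illustration, suppose someone asks, long after launch, how many units of each item have actually shipped. A sketch of a new projection built purely from history might look like this. The `build_units_shipped` function and the second order (`e5f6g7h8`) are hypothetical, added only so the example has an unshipped order to ignore.

```python
from collections import Counter, defaultdict


def build_units_shipped(history):
    """Replay the full event history into a brand-new read model:
    total units shipped per item."""
    pending = defaultdict(Counter)  # orderId -> itemId -> quantity not yet shipped
    shipped = Counter()             # itemId -> units shipped
    for event in history:
        if event["eventType"] == "OrderItemAdded":
            pending[event["orderId"]][event["itemId"]] += event["quantity"]
        elif event["eventType"] == "OrderShipped":
            # Move everything on this order into the shipped totals.
            shipped.update(pending.pop(event["orderId"], Counter()))
    return dict(shipped)


history = [
    {"eventType": "OrderCreated", "orderId": "a1b2c3d4", "customerId": "user123"},
    {"eventType": "OrderItemAdded", "orderId": "a1b2c3d4", "itemId": "widget-x",
     "quantity": 2, "price": 15.99},
    # Hypothetical second order that has not shipped yet.
    {"eventType": "OrderCreated", "orderId": "e5f6g7h8", "customerId": "user456"},
    {"eventType": "OrderItemAdded", "orderId": "e5f6g7h8", "itemId": "widget-x",
     "quantity": 5, "price": 15.99},
    {"eventType": "OrderShipped", "orderId": "a1b2c3d4",
     "shippingAddress": "123 Main St, Anytown, USA"},
]

units_shipped = build_units_shipped(history)
```

The existing events were read but not modified, and no command handler had to change to support the new view.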

When appending events, especially in a distributed system, you need to ensure atomicity not just for a single event, but for a sequence of events representing a single command’s outcome. This often involves transaction-like behavior within the event store or careful handling of retries and idempotency on the client side.
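One way to get this behavior from a relational store is to write all of a command's events in a single transaction and tag them with a command identifier so that retries can be detected. The sketch below uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL. The `command_id` column, the `open_store` and `append_events` functions, and the simplified column types are assumptions for illustration and are not part of the schema described earlier.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory events table keyed by stream and sequence number."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE events (
            aggregate_type  TEXT    NOT NULL,
            aggregate_id    TEXT    NOT NULL,
            sequence_number INTEGER NOT NULL,
            event_type      TEXT    NOT NULL,
            event_data      TEXT    NOT NULL,
            command_id      TEXT    NOT NULL,
            PRIMARY KEY (aggregate_type, aggregate_id, sequence_number)
        )""")
    return conn


def append_events(conn, aggregate_type, aggregate_id, expected_version,
                  command_id, events):
    """Append every event produced by one command, or none of them.

    Returns False if this command_id was already recorded (a client retry).
    Raises sqlite3.IntegrityError if another writer took the sequence numbers.
    """
    already_applied = conn.execute(
        "SELECT 1 FROM events WHERE aggregate_type = ? AND aggregate_id = ? "
        "AND command_id = ? LIMIT 1",
        (aggregate_type, aggregate_id, command_id)).fetchone()
    if already_applied:
        return False

    rows = [(aggregate_type, aggregate_id, expected_version + offset + 1,
             event["eventType"], json.dumps(event), command_id)
            for offset, event in enumerate(events)]
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so a primary-key conflict on any row discards the whole batch.
    with conn:
        conn.executemany(
            "INSERT INTO events (aggregate_type, aggregate_id, sequence_number, "
            "event_type, event_data, command_id) VALUES (?, ?, ?, ?, ?, ?)",
            rows)
    return True


conn = open_store()
batch = [
    {"eventType": "OrderCreated", "orderId": "a1b2c3d4", "customerId": "user123"},
    {"eventType": "OrderItemAdded", "orderId": "a1b2c3d4", "itemId": "widget-x",
     "quantity": 2},
]
first = append_events(conn, "Order", "a1b2c3d4", 0, "cmd-1", batch)
retry = append_events(conn, "Order", "a1b2c3d4", 0, "cmd-1", batch)
```

With several concurrent writers, the duplicate check and the insert would need to run inside the same transaction, or the command identifier would need its own unique constraint. The sketch skips that for brevity.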
