Cassandra’s "row cache" and "key cache" don’t hold quite what their names suggest. The key cache stores the positions of partitions inside SSTables, not just keys, and the row cache stores the leading rows of whole partitions, not arbitrary individual rows. The names are historical artifacts that describe roughly what is being cached, not how.

Let’s see this in action. Imagine we have a simple table:

CREATE TABLE users (
    user_id uuid PRIMARY KEY,
    name text,
    email text
);

When you run the query SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000; Cassandra first has to work out which SSTables (and memtables) may hold that partition, and then locate the data inside each of them.

Here’s a simplified view of what happens without caching:

  1. Client Request: A query arrives for user_id = X.
  2. Coordinator Node: Receives the request and determines which replica node(s) own X.
  3. Replica Node:
    • Partition Key Lookup: Checks the in-memory memtable, then uses each candidate SSTable’s on-disk index files to find the offset of the partition containing user_id = X.
    • SSTable Read: Seeks to that offset in each relevant SSTable’s data file and reads the partition. Because the partition may be spread over several SSTables, the results are merged.
    • Data Retrieval: Extracts the row data.
  4. Response: Sends the row back to the coordinator.

Now, let’s talk about the caches. The terminology is a bit misleading because the caches don’t store individual rows or keys directly in the way you might expect.

Key Cache:

The "key cache" primarily caches the offsets of partitions within SSTables. When a query comes in for user_id = X, Cassandra checks the key cache. If the offset for the partition containing X is found, it significantly speeds up locating the correct SSTable section. It doesn’t store the row data itself.

  • What it caches: Partition index entries (mapping partition keys to SSTable offsets).
  • When it’s useful: For frequently accessed partitions, especially in tables with many small partitions or when reading specific rows repeatedly.
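  • Configuration: key_cache_size_in_mb in cassandra.yaml (renamed key_cache_size in Cassandra 4.1). Left unset, it is sized automatically to the smaller of 5% of the heap and 100 MB. Whether to go beyond that depends heavily on your cluster’s memory and workload.
  • How to check: nodetool info shows the Key Cache entries, size, capacity, hits, requests, and recent hit rate for the node. Per-table figures come from nodetool tablestats <keyspace>.<table> (cfstats in older releases) where your version reports them, or from the table-level KeyCacheHitRate metric over JMX.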
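To make the automatic sizing concrete, here is a minimal sketch of the arithmetic, assuming the documented rule that an unset key cache size resolves to the smaller of 5% of the heap and 100 MB (with a floor of 1 MB). The function name default_key_cache_mb is hypothetical and exists only for this illustration.

```python
def default_key_cache_mb(heap_mb: int) -> int:
    """Assumed auto-sizing rule: min(5% of heap, 100 MB), never below 1 MB."""
    five_percent = heap_mb * 5 // 100
    return min(max(1, five_percent), 100)


for heap_mb in (1024, 2048, 8192):
    print(f"{heap_mb} MB heap -> {default_key_cache_mb(heap_mb)} MB key cache")
```

With heaps of 2 GB or more the 100 MB cap is what applies, so a larger key cache has to be configured explicitly.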

Row Cache:

The "row cache" caches actual row data for frequently read partitions. On a hit, Cassandra returns the data directly without reading any SSTable. It is most effective for a small set of partitions that are read very often and written rarely, especially when each one is relatively small.

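  • What it caches: The leading rows of a partition, up to the table’s rows_per_partition setting (or the whole partition with ALL).
  • When it’s useful: For "hot" partitions that are read repeatedly. Think of a configuration table or a frequently accessed user profile. It is less effective for wide partitions or for tables whose reads are spread across many distinct partitions.
  • Configuration: row_cache_size_in_mb in cassandra.yaml (row_cache_size from Cassandra 4.1). It defaults to 0, which disables the row cache, and a table must also opt in through the rows_per_partition value of its caching option. The right size is workload-dependent.
  • How to check: nodetool info shows the Row Cache entries, size, capacity, hits, requests, and recent hit rate for the node. Per-table hits and misses are exposed through the table-level JMX metrics (RowCacheHit, RowCacheMiss).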
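If you want hit rates as numbers you can track over time, a small parser is enough. The sketch below assumes the cache lines in nodetool info contain "<n> hits, <m> requests"; the exact layout varies between versions, and the SAMPLE text uses invented numbers purely for illustration. The function name cache_hit_rates is hypothetical.

```python
import re

# Assumed line shape: "Key Cache : entries ..., <hits> hits, <requests> requests, ..."
CACHE_LINE = re.compile(r"^(Key|Row) Cache\s*:.*?(\d+) hits, (\d+) requests")


def cache_hit_rates(nodetool_info_output: str) -> dict:
    """Return {'Key': rate, 'Row': rate}; rate is None when there were no requests."""
    rates = {}
    for line in nodetool_info_output.splitlines():
        match = CACHE_LINE.match(line.strip())
        if match:
            name = match.group(1)
            hits, requests = int(match.group(2)), int(match.group(3))
            rates[name] = hits / requests if requests else None
    return rates


# Illustrative sample only; the numbers are made up.
SAMPLE = """\
Key Cache              : entries 1200, size 96.5 KiB, capacity 100 MiB, 9000 hits, 10000 requests, 0.900 recent hit rate, 14400 save period in seconds
Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
"""

print(cache_hit_rates(SAMPLE))
```

Note that hits and requests are cumulative since the node started, so the ratio computed here is a lifetime average, not the "recent hit rate" that nodetool prints.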

The Nuance: It’s Not Just Rows or Keys

The most surprising truth about Cassandra’s caching is that neither cache works strictly on individual keys or rows. Cassandra’s primary on-disk data structure is the SSTable, an immutable, sorted file of partitions. Both caches exist to cut down the work of reading SSTables.

Bloom filters decide which SSTables are worth checking for a given partition; the key cache then supplies the position of that partition inside a particular SSTable’s data file, so the index lookup can be skipped. The row cache, when enabled, holds serialized row data for specific partitions. If a partition is in the row cache, Cassandra can skip the SSTables altogether and serve the data directly.

The system in action: When a read request comes for a specific row user_id = X:

  1. Cassandra checks the row_cache. If the row data for X is present and valid, it’s served directly. This is a cache hit.
  2. If not in the row_cache, it checks the key_cache. If the offset for the partition containing X is present, it uses that offset to quickly find the relevant data within an SSTable. This is a key cache hit.
  3. If neither cache hits, Cassandra takes the slow path: it consults the SSTable’s index files to find the partition offset, then seeks to that offset and reads the data from the SSTable. This is an indexed read, not a scan of the whole file.
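The three steps above can be sketched as a toy model. This is not Cassandra’s implementation: ToyReplica, its plain dictionaries, and the single-SSTable assumption are all hypothetical simplifications meant only to show the lookup order.

```python
class ToyReplica:
    """Toy model of the read lookup order. Not Cassandra's real code."""

    def __init__(self, index, data, row_cache_enabled=True):
        self.index = index    # partition key -> offset (stands in for SSTable index files)
        self.data = data      # offset -> row (stands in for the SSTable data file)
        self.row_cache_enabled = row_cache_enabled
        self.row_cache = {}   # partition key -> row
        self.key_cache = {}   # partition key -> offset

    def read(self, key):
        """Return (row, path), where path names the step that located the data."""
        # Step 1: the row cache holds the data itself.
        if self.row_cache_enabled and key in self.row_cache:
            return self.row_cache[key], "row cache hit"

        # Step 2: the key cache holds the offset, so the index lookup is skipped.
        if key in self.key_cache:
            offset, path = self.key_cache[key], "key cache hit"
        else:
            # Step 3: neither cache hit; consult the index and remember the offset.
            offset, path = self.index.get(key), "index lookup"
            if offset is None:
                return None, "not found"
            self.key_cache[key] = offset

        row = self.data[offset]  # steps 2 and 3 both still read the data file
        if self.row_cache_enabled:
            self.row_cache[key] = row
        return row, path


index = {"X": 0}
data = {0: {"name": "Ada", "email": "ada@example.com"}}

keys_only = ToyReplica(index, data, row_cache_enabled=False)
print([keys_only.read("X")[1] for _ in range(3)])
# ['index lookup', 'key cache hit', 'key cache hit']

with_rows = ToyReplica(index, data)
print([with_rows.read("X")[1] for _ in range(3)])
# ['index lookup', 'row cache hit', 'row cache hit']
```

The model makes one point visible: a key cache hit still reads the data file, while a row cache hit does not touch it at all.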

Mental Model:

Think of the caches as shortcuts. The key cache is a shortcut to finding the map to the data. The row cache is a shortcut that is the data itself. Both are attempts to avoid the most expensive operation: reading from disk.

The choice between enabling/tuning the key cache versus the row cache depends heavily on your read patterns.

  • Key Cache: Generally more beneficial for read-heavy workloads where you’re accessing many different partitions, but perhaps not the same few rows over and over. It helps reduce disk seeks by quickly pointing to the right location on disk.
  • Row Cache: Best for workloads where a small subset of your data is accessed repeatedly. If you have a few rows that are read thousands of times a second, the row cache can be a massive performance boost. However, it consumes more memory per entry than the key cache and can lead to cache invalidation overhead.

Important Considerations:

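  • Memory: The key cache lives on the JVM heap, so over-sizing it can lead to Garbage Collection (GC) issues. The row cache keeps its data off-heap by default, but every megabyte given to it is a megabyte taken away from the OS page cache.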
  • Invalidation: When data is updated or deleted, the corresponding entries in the row cache must be invalidated. This adds overhead.
  • Workload: For write-heavy workloads, the benefits of row caching diminish as data changes frequently.
  • Alternatives: Memtables (which serve recent writes from memory) and Bloom filters (which let a read skip SSTables that cannot contain the partition) also play crucial roles in reducing disk reads, often making explicit row caching less critical than it might seem. Many modern Cassandra deployments disable row caching entirely and rely on the OS page cache and the key cache.

The one thing most people don’t realize is that the cache sizes in cassandra.yaml (key_cache_size_in_mb and row_cache_size_in_mb) are node-wide, not per-keyspace or per-table. Setting a size to 0 disables that cache on the node. The key_cache_save_period and row_cache_save_period settings control how often the cached keys are written to disk so that a restarted node can warm its caches; setting a period to 0 disables saving. By default, the key cache is enabled and auto-sized to the smaller of 5% of the heap and 100 MB, while the row cache is disabled (0 MB). Which tables actually use the caches is decided per table with ALTER TABLE ... WITH caching = {...}; the default caches all keys and no rows. That option only controls whether a table participates; it does not change the node-wide sizes.
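As a sketch of the per-table side, the helper below renders an ALTER TABLE statement for the caching option, which takes a 'keys' value of ALL or NONE and a 'rows_per_partition' value of ALL, NONE, or a positive row count. The helper name alter_table_caching and the keyspace name app are hypothetical; only the caching map syntax comes from CQL.

```python
def alter_table_caching(keyspace, table, keys="ALL", rows_per_partition="NONE"):
    """Render an ALTER TABLE statement that sets the per-table caching option."""
    if keys not in ("ALL", "NONE"):
        raise ValueError("keys must be 'ALL' or 'NONE'")

    is_int = isinstance(rows_per_partition, int) and not isinstance(rows_per_partition, bool)
    if is_int and rows_per_partition > 0:
        rows = str(rows_per_partition)      # cache only the first N rows of each partition
    elif rows_per_partition in ("ALL", "NONE"):
        rows = rows_per_partition
    else:
        raise ValueError("rows_per_partition must be 'ALL', 'NONE', or a positive integer")

    return (
        f"ALTER TABLE {keyspace}.{table} WITH caching = "
        f"{{'keys': '{keys}', 'rows_per_partition': '{rows}'}};"
    )


# Keep the key cache on and cache the single row of each users partition.
print(alter_table_caching("app", "users", keys="ALL", rows_per_partition=1))
# ALTER TABLE app.users WITH caching = {'keys': 'ALL', 'rows_per_partition': '1'};
```

Remember that opting a table in to row caching has no effect while the node-wide row cache size is still 0.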

The next thing you’ll likely encounter is understanding how the OS page cache interacts with Cassandra’s on-disk structures, and when to tune one over the other.
