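A Cassandra counter isn’t stored as a single number that clients overwrite. Clients only ever send increments and decrements (deltas), and Cassandra keeps the result as a set of per-replica subtotals, called shards, whose sum is the counter’s value.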
Let’s see this in action. Imagine we have a simple counter table page_views with a url (partition key) and a count (counter type).
CREATE TABLE page_views (
    url text PRIMARY KEY,
    count counter  -- every non-key column in a counter table must be a counter
);
If we record 5 views for /home (counters are changed with UPDATE; INSERT is not allowed on a counter table):
UPDATE page_views SET count = count + 5 WHERE url = '/home';
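The statement never carries an absolute value, only the delta +5. The replica that handles the write applies that delta to its own shard of the counter (a counter that doesn’t exist yet is treated as 0) and replicates the result.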
Now, if another update comes in for 10 more views:
UPDATE page_views SET count = count + 10 WHERE url = '/home';
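Cassandra applies another delta, this time +10. Nothing is left "pending": depending on which replica handled each write, the two increments end up folded into one shard (15) or held in two shards (5 and 10). Either way the counter’s value is 15.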
When we SELECT the count:
SELECT count FROM page_views WHERE url = '/home';
Cassandra reads the counter’s shards and sums them. In this case, it returns 15. The read does not modify anything: a second SELECT with no updates in between also returns 15.
This split between delta-only writes and shard-summing reads is the core of how counters work, and it’s also the source of their limitations.
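As a rough mental model (a toy sketch, not Cassandra’s storage code), a counter can be pictured as one running subtotal per replica. The class and replica names below are made up for illustration. A write folds a delta into one subtotal, and a read sums the subtotals without changing them:

```python
from collections import defaultdict


class ToyCounter:
    """Toy model of one counter cell: one running subtotal ("shard") per replica."""

    def __init__(self):
        # replica name -> subtotal of all deltas that replica has applied
        self.shards = defaultdict(int)

    def apply(self, replica, delta):
        # A write carries only a delta; the handling replica folds it into its shard.
        self.shards[replica] += delta

    def read(self):
        # A read sums the shards and leaves them unchanged.
        return sum(self.shards.values())


home = ToyCounter()
home.apply("replica-a", 5)    # UPDATE ... SET count = count + 5
home.apply("replica-b", 10)   # UPDATE ... SET count = count + 10
first_read = home.read()
second_read = home.read()     # no writes in between
print(first_read, second_read)  # 15 15
```

Both reads return the same value because reading has no side effects in this model.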
The Problem: Non-Idempotent Writes and Inexact Counts
The primary issue is that counter updates are not idempotent. If an UPDATE times out, the client can’t tell whether the increment was applied: retrying may count it twice, and not retrying may lose it. Reads are also subject to the usual consistency-level trade-offs. For example, a read at consistency level ONE may reach a replica that hasn’t yet received a recent increment and return a slightly stale value.
This also means you can’t reliably use counters for anything requiring a precise, globally consistent snapshot at any given moment. If you’re trying to get a total count across many different URLs, each partition is read independently while updates keep arriving, so the aggregated sum won’t correspond to any single point in time.
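The retry hazard with counter updates can be reproduced with a small simulation. This is a sketch under stated assumptions: `ToyServer`, `WriteTimeout`, and `increment_with_retry` are hypothetical names, and the "server" is an in-memory object that applies the first write but loses its acknowledgement.

```python
class WriteTimeout(Exception):
    """Raised when the client gives up waiting for an acknowledgement."""


class ToyServer:
    """Hypothetical server whose first acknowledgement is lost after the write is applied."""

    def __init__(self):
        self.count = 0
        self.drop_next_ack = True

    def increment(self, delta):
        self.count += delta              # the write is applied...
        if self.drop_next_ack:
            self.drop_next_ack = False
            raise WriteTimeout()         # ...but the client never hears about it


def increment_with_retry(server, delta, attempts=2):
    # Naive retry: fine for idempotent writes, unsafe for counter deltas.
    for _ in range(attempts):
        try:
            server.increment(delta)
            return
        except WriteTimeout:
            continue
    raise WriteTimeout()


server = ToyServer()
increment_with_retry(server, 1)
print(server.count)   # 2: one logical page view was counted twice
```

Dropping the retry avoids the double count here, but in the opposite failure (the write never arrived) it would lose the increment. The client cannot tell the two cases apart.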
The Real-World Use Case: Simple, High-Volume Increments
Counters are best suited for scenarios where you need to track very high-volume increments and can tolerate a count that is occasionally off by a small amount. Think of things like:
- Page View Counts: Knowing that a page has been viewed "around 10,000 times" is often sufficient.
- Like/Dislike Counts: Similar to page views, the exact number at any given millisecond isn’t critical.
- Simple Event Tracking: Counting how many times a specific event has occurred.
The Mechanics of Counter Updates
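When you issue an UPDATE ... SET count = count + N, the client takes no lock and sends only the delta. Since Cassandra 2.1, the coordinator forwards the update to one replica that acts as leader for that write. The leader takes an internal lock on the counter, reads the current value of its own shard, adds N, and writes the new shard value through the normal write path (commit log, memtable, and eventually SSTables) before replicating it to the other replicas. During a read, Cassandra merges the shards it finds, keeping the most recent version of each, and sums them.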
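The write path can be sketched as a simplified toy, assuming each shard is a (clock, count) pair keyed by replica id. `ToyReplica` and its methods are illustrative names, not Cassandra internals. The handling replica reads its own shard under a lock, bumps the clock, and stores the new subtotal; other replicas merge by keeping the higher clock.

```python
import threading


class ToyReplica:
    """Simplified replica state for one counter: {replica_id: (clock, count)}."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.shards = {}
        self._lock = threading.Lock()

    def apply_delta(self, delta):
        # Read-before-write under a lock: read own shard, bump clock, store new subtotal.
        with self._lock:
            clock, count = self.shards.get(self.replica_id, (0, 0))
            self.shards[self.replica_id] = (clock + 1, count + delta)
            return dict(self.shards)   # state to replicate to the other replicas

    def receive(self, incoming):
        # Merge: for each shard keep whichever version has the higher clock.
        with self._lock:
            for rid, (clock, count) in incoming.items():
                if clock > self.shards.get(rid, (0, 0))[0]:
                    self.shards[rid] = (clock, count)

    def value(self):
        with self._lock:
            return sum(count for _, count in self.shards.values())


a, b = ToyReplica("a"), ToyReplica("b")
b.receive(a.apply_delta(5))     # +5 handled by replica a, then replicated to b
update = b.apply_delta(10)      # +10 handled by replica b
a.receive(update)
a.receive(update)               # replaying replication is harmless
print(a.value(), b.value())     # 15 15
```

In this model, replicating the same shard state twice changes nothing because shards carry totals and clocks. A client resending a delta is different: it creates a new, higher-clock total.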
When to Absolutely Avoid Counters
- When you need exact counts: If your application logic depends on a value that must be exactly right (billing, inventory, quotas), counters are a bad fit, because an update that times out may have been applied zero, one, or (after a retry) two times. A common alternative is to write each event as a regular row with a unique id, which makes retries idempotent, and aggregate the rows yourself.
- When you need to perform operations other than increment/decrement: You cannot SET a counter to an arbitrary value, nor can you perform arithmetic operations beyond simple addition or subtraction. You can’t do SET count = count * 2 or SET count = 0.
- When you need to read and then update the same counter in a single transaction: A SELECT count followed by an UPDATE count = count + 1 is two independent operations, and counter tables do not accept conditional (IF) updates. The increment itself is not lost, because count + 1 is evaluated on the server. But if another update lands between your SELECT and your UPDATE, any decision you made from the value you read (for example, "stop at 100") is based on stale data.
- When you need to delete a counter and reuse it: You can DELETE a counter column or its row, but Cassandra’s documentation warns against updating a counter after deleting it, since the old value may resurface. There is no supported way to zero a counter out and keep counting from there.
- When you need to aggregate counters across multiple partitions in a single query: A SUM(count) in a SELECT statement only aggregates the rows returned by that query. Without a partition key restriction it becomes a scan across the cluster that is slow, may time out, and is not a consistent snapshot, since updates continue while the scan runs.
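The "regular rows with a unique id" alternative from the first item can be illustrated with an in-memory stand-in. This is a sketch, assuming a hypothetical table keyed by (url, event_id); `EventLog` is not a real API. Writing the same key twice leaves one row, so a retried write cannot inflate the count.

```python
import uuid


class EventLog:
    """Toy stand-in for a regular (non-counter) table keyed by (url, event_id)."""

    def __init__(self):
        self.rows = {}

    def record(self, url, event_id):
        # Writing the same primary key twice overwrites the row, so retries are harmless.
        self.rows[(url, event_id)] = True

    def count(self, url):
        # Manual aggregation: count the rows stored for one url.
        return sum(1 for (row_url, _) in self.rows if row_url == url)


log = EventLog()
event_id = uuid.uuid4()          # generated once, by the client, per logical event
log.record("/home", event_id)
log.record("/home", event_id)    # retry after a timeout: same key, same row
log.record("/home", uuid.uuid4())
print(log.count("/home"))        # 2
```

The trade-off is that reads now have to aggregate many rows instead of fetching a single value.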
The One Thing Most People Don’t Know
The underlying representation of a counter is not one number but a set of shards, each holding a running subtotal and a logical clock for one replica. A SELECT merges the shards it finds, keeping the highest-clock version of each, and sums the subtotals. It never removes anything, so repeated reads without intervening writes return the same value. What is not idempotent is the write. And because the replica handling an increment has to read its current shard before writing the new one, counter writes are slower than regular Cassandra writes, not faster.
The Next Hurdle: Handling Counter Deletes and Resets
If you find yourself needing to "reset" a counter, you’ll discover that Cassandra counters don’t directly support this. TTL is not an option either, because counter columns cannot be written with USING TTL. Subtracting the value you just read races with concurrent increments. The usual workaround is a data-modeling one: add a time bucket or epoch to the primary key, so that a "reset" simply means counting in a new row.
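A minimal sketch of the bucketing idea, assuming a hypothetical table whose primary key is (url, bucket) with one counter column. The daily bucket format and the `bucket_for` and `record_view` names are illustrative choices, and a dictionary stands in for the table.

```python
from collections import defaultdict
from datetime import datetime, timezone


def bucket_for(ts):
    # Hypothetical bucketing scheme: one counter row per url per UTC day.
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


# Stand-in for a table whose primary key is (url, bucket) with one counter column.
page_views_by_day = defaultdict(int)


def record_view(url, ts):
    page_views_by_day[(url, bucket_for(ts))] += 1


day1 = datetime(2024, 1, 1, 23, 59, tzinfo=timezone.utc)
day2 = datetime(2024, 1, 2, 0, 1, tzinfo=timezone.utc)
record_view("/home", day1)
record_view("/home", day1)
record_view("/home", day2)       # new day, new row: the count starts from zero
print(page_views_by_day[("/home", "2024-01-02")])   # 1
```

Old buckets can be deleted as whole rows once they are no longer needed, and a lifetime total becomes the sum over buckets.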