DynamoDB’s capacity units don’t scale linearly with data size, and you’re likely overpaying or hitting throttles because you haven’t accounted for the actual data read or written per operation.
Let’s see it in action. Imagine a simple GetItem operation on a table with a single partition key user_id and a sort key timestamp.
{
"TableName": "MyUsers",
"Key": {
"user_id": {"S": "user123"},
"timestamp": {"N": "1678886400"}
}
}
If this item is 1KB in size, the read is still billed as one full 4KB unit: a strongly consistent GetItem consumes 1 Read Capacity Unit (RCU), and an eventually consistent one (the default) consumes 0.5 RCU. This seems straightforward. But what if you’re using a Query operation to fetch multiple items?
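Consider a Query to get all activities for user123 within a specific time range. Because timestamp is a DynamoDB reserved word, the key condition refers to it through the placeholder #ts.
{
"TableName": "MyUserActivity",
"KeyConditionExpression": "user_id = :uid AND #ts BETWEEN :start AND :end",
"ExpressionAttributeNames": {"#ts": "timestamp"},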
"ExpressionAttributeValues": {
":uid": {"S": "user123"},
":start": {"N": "1678886400"},
":end": {"N": "1678890000"}
}
}
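Suppose this query returns 10 items, each 0.5KB in size, for a total of 5KB of returned data. You might expect the cost to follow directly from those 5KB: rounded up to 8KB, that is 2 RCUs for a strongly consistent read, or 1 RCU for an eventually consistent one. This is where the common misconception bites.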
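DynamoDB calculates RCUs based on the total size of the items read to satisfy the request, not just the data returned. A key condition on its own never reads more than it returns, but a FilterExpression is applied after the items have been read. If DynamoDB had to read 20 items, each 0.5KB, and the filter kept only 10 of them, the cost is based on 10KB: rounded up to 12KB, that is 3 RCUs strongly consistent, or 1.5 RCUs eventually consistent. You get 5KB of data for the price of 10KB.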
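This is even more pronounced with Scan operations, which read every item in the table unless you stop paginating early. A Scan that returns 1MB of data from a table with 100MB of data is billed for all 100MB: 25,600 RCUs with strongly consistent reads, or 12,800 RCUs with the default eventually consistent reads, not the 256 or 128 RCUs that 1MB alone would cost.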
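To make the arithmetic concrete, here is a minimal sketch of the read-side rule: sum the sizes of the items read, round up to the next 4KB, then charge 1 RCU per unit for strongly consistent reads or 0.5 RCU for eventually consistent ones. The function name and the item sizes are illustrative assumptions, not part of any AWS SDK, and the estimate ignores pagination.

```python
import math

READ_UNIT_BYTES = 4 * 1024  # reads are metered in 4 KB units


def estimate_read_rcus(item_sizes_bytes, strongly_consistent=False):
    """Estimate RCUs for a Query or Scan request (hypothetical helper).

    item_sizes_bytes must list every item DynamoDB READ, i.e. before any
    FilterExpression is applied, not just the items that came back.
    """
    total_bytes = sum(item_sizes_bytes)
    # Sizes are summed first, then rounded up to the next 4 KB unit.
    # Assumption: a request that reads nothing is still billed one unit.
    units = max(1, math.ceil(total_bytes / READ_UNIT_BYTES))
    return units * (1.0 if strongly_consistent else 0.5)


if __name__ == "__main__":
    items_read = [512] * 20  # 20 items of 0.5 KB read, only 10 returned
    print(estimate_read_rcus(items_read, strongly_consistent=True))
    print(estimate_read_rcus(items_read))
    # Treats a 100 MB table as one lump; a real Scan is paged.
    print(estimate_read_rcus([100 * 1024 * 1024]))
```

Feeding the function the sizes of the items returned instead of the items read is exactly the mistake described above: it underestimates the bill whenever a filter discards items.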
The fundamental problem DynamoDB solves is providing predictable, single-digit millisecond latency at scale. To do this, it meters throughput in fixed-size units: 4KB for reads and 1KB for writes. Sizes are always rounded up to the next unit, so reading an item of only a few bytes is billed as a full 4KB.
- Read Capacity Units (RCUs):
  - Eventually Consistent Reads: each 4KB unit costs 0.5 RCU, so 1 RCU covers two eventually consistent reads of up to 4KB per second. An item larger than 4KB is rounded up to the next multiple of 4KB. For example, a 7KB item is billed as 8KB and consumes 1 RCU for an eventually consistent read (2 units * 0.5).
  - Strongly Consistent Reads: each 4KB unit costs 1 RCU, double the cost of an eventually consistent read. A 7KB item will consume 2 RCUs (2 units * 1).
- Write Capacity Units (WCUs):
  - 1 WCU writes up to 1KB of data per request. If you write an item larger than 1KB, it consumes multiple WCUs. A 3KB item will consume 3 WCUs.
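The per-item rules can be sketched as two small functions: reads round the item size up to 4KB units (0.5 RCU per unit when eventually consistent, 1 RCU when strongly consistent), and writes round up to 1KB units at 1 WCU each. The function names are hypothetical helpers for illustration, and transactional operations are left out.

```python
import math


def item_read_rcus(item_bytes, strongly_consistent=False):
    """RCUs for reading one item, e.g. with GetItem (hypothetical helper)."""
    # Round the item up to whole 4 KB units; even a tiny item costs one unit.
    units = max(1, math.ceil(item_bytes / 4096))
    return units * (1.0 if strongly_consistent else 0.5)


def item_write_wcus(item_bytes):
    """WCUs for writing one item with a standard (non-transactional) write."""
    # Round the item up to whole 1 KB units.
    return max(1, math.ceil(item_bytes / 1024))


if __name__ == "__main__":
    seven_kb = 7 * 1024
    print(item_read_rcus(seven_kb))                            # eventually consistent
    print(item_read_rcus(seven_kb, strongly_consistent=True))  # strongly consistent
    print(item_write_wcus(3 * 1024))
```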
The Key Levers:
- Item Size: Smaller items mean fewer KB read/written per operation, thus fewer capacity units consumed.
- Read Consistency: Opt for eventually consistent reads wherever your application logic allows. The cost is half.
- Query vs. Scan: Use Query whenever possible. Query uses the key to narrow the search to one partition key value and a sort key range, so only matching items are read. Scan reads everything, making it inefficient and expensive for large tables.
- Limit in Query and Scan: Limit caps the number of items DynamoDB evaluates, not the number that pass a filter. It does bound the capacity a single request consumes, but combined with a FilterExpression a call can read Limit items and return fewer of them, or none.
- Projection Expressions: Use ProjectionExpression to specify exactly which attributes you need. This reduces the amount of data transferred over the network, but it does not reduce RCUs: consumption is based on the full size of each item read, not on the attributes returned.
Consider a PutItem operation for a 5KB item. This will consume 5 WCUs. If you’re writing 100 such items per second, you need 500 WCUs provisioned. Batching does not change this: if your application groups the writes into BatchWriteItem calls (up to 25 items per call), each item still costs 5 WCUs, so a batch of 25 items (125KB in total) costs 125 WCUs. The BatchWriteItem operation itself has a limit of 16MB total payload size and 25 put or delete requests, but the cost is still the sum of individual item write costs.
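As a quick check on the write-side numbers, the sketch below sums per-item WCUs for a batch and derives the provisioned throughput for a steady write rate. The helper names are illustrative assumptions; the 25-request cap mirrors the BatchWriteItem limit mentioned above.

```python
import math

MAX_BATCH_REQUESTS = 25  # BatchWriteItem accepts at most 25 put/delete requests


def batch_write_wcus(item_sizes_bytes):
    """WCUs for one BatchWriteItem call: the sum of per-item costs."""
    if len(item_sizes_bytes) > MAX_BATCH_REQUESTS:
        raise ValueError("a single batch holds at most 25 items")
    # Each item is rounded up to whole 1 KB units on its own.
    return sum(max(1, math.ceil(size / 1024)) for size in item_sizes_bytes)


def required_wcus_per_second(item_bytes, items_per_second):
    """Provisioned WCUs needed to sustain a steady write rate."""
    return max(1, math.ceil(item_bytes / 1024)) * items_per_second


if __name__ == "__main__":
    five_kb = 5 * 1024
    print(batch_write_wcus([five_kb] * 25))
    print(required_wcus_per_second(five_kb, 100))
```

Batching saves network round trips, not capacity: the batch total is the same number you would get by adding up 25 separate PutItem calls.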
The most overlooked aspect of RCU calculation is pagination. A single Query or Scan call reads at most 1MB of data (a "page"); if more data remains, the response includes a LastEvaluatedKey that you pass back as ExclusiveStartKey to fetch the next page. Within a page, the sizes of all items read are summed and rounded up to the next 4KB, so a full 1MB page costs 128 RCUs eventually consistent, or 256 RCUs strongly consistent. For single-item reads, the rounding applies per item: a 6KB item is billed as 8KB, which is 1 RCU for an eventually consistent read and 2 RCUs for a strongly consistent one. A 10KB item is billed as 12KB, which is 1.5 and 3 RCUs respectively. This depends only on the total size of the item, not on which attributes you request.
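Rather than estimating, you can ask DynamoDB what each page actually cost. The following is a minimal sketch that follows LastEvaluatedKey across pages and adds up the reported ConsumedCapacity. It assumes a boto3 DynamoDB client is passed in (for example `boto3.client("dynamodb")`); the function name `query_all_pages` is a hypothetical helper, not part of boto3.

```python
def query_all_pages(client, **query_kwargs):
    """Run a Query to completion and total the capacity DynamoDB reports.

    client is assumed to expose boto3's low-level query(**kwargs) method.
    Returns (items, consumed_capacity_units).
    """
    items, consumed = [], 0.0
    # Ask DynamoDB to include the consumed capacity in every response.
    kwargs = dict(query_kwargs, ReturnConsumedCapacity="TOTAL")
    while True:
        page = client.query(**kwargs)
        items.extend(page.get("Items", []))
        consumed += page.get("ConsumedCapacity", {}).get("CapacityUnits", 0.0)
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break  # no more pages
        # Resume the next page where this one stopped.
        kwargs["ExclusiveStartKey"] = last_key
    return items, consumed
```

A call might look like `query_all_pages(client, TableName="MyUserActivity", KeyConditionExpression="user_id = :uid", ExpressionAttributeValues={":uid": {"S": "user123"}})`. Comparing the returned total against the size of the items you received is a direct way to spot filters that read far more than they return.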
The next critical concept is understanding how Query operations actually perform.