DynamoDB Provisioned vs On-Demand Capacity: Pick the Right Mode (2026)

DynamoDB’s capacity modes aren’t just about how you pay; they fundamentally change how your application interacts with the database at a system level.

Let’s see this in action. Imagine a simple put_item operation.

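import boto3

# Uses the default credentials and Region from your environment.
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('my-example-table')

# Write one small item; the call is identical in both capacity modes.
response = table.put_item(
    Item={
        'id': 'user-123',
        'data': 'some value'
    }
)
print(response)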

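In Provisioned capacity mode, this put_item request consumes write capacity units (WCUs): one WCU for a standard write of an item up to 1 KB, rounded up to the next KB for larger items. DynamoDB checks that consumption against the write throughput you provisioned for the table, per second. Short bursts can draw on unused capacity the table has banked over the last few minutes, but once that is gone the request is throttled, and when the SDK’s automatic retries are exhausted you get a ProvisionedThroughputExceededException.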
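To make the capacity-unit arithmetic concrete, here is a minimal sketch of how item size maps to consumed units. The function names are hypothetical helpers, not part of boto3, and the sketch only covers single-item reads and writes (no indexes, no batch operations).

```python
import math

def write_capacity_units(item_size_bytes: int, transactional: bool = False) -> int:
    """WCUs for one write: 1 per 1 KB of item size (rounded up), doubled if transactional."""
    if item_size_bytes < 0:
        raise ValueError("item_size_bytes must be non-negative")
    units = max(1, math.ceil(item_size_bytes / 1024))
    return units * 2 if transactional else units

def read_capacity_units(item_size_bytes: int, consistency: str = "strong") -> float:
    """RCUs for one read: 1 per 4 KB (rounded up), scaled by the consistency level."""
    if item_size_bytes < 0:
        raise ValueError("item_size_bytes must be non-negative")
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    # Eventually consistent reads cost half; transactional reads cost double.
    factor = {"eventual": 0.5, "strong": 1.0, "transactional": 2.0}[consistency]
    return blocks * factor

if __name__ == "__main__":
    print(write_capacity_units(100))            # small item, like the example above
    print(write_capacity_units(3500))           # larger item rounds up to whole KBs
    print(read_capacity_units(4097, "eventual"))
```

The tiny item in the put_item example above falls into the first case: anything up to 1 KB costs a single WCU per standard write.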

In On-Demand mode, there’s no pre-allocated capacity to check. DynamoDB instead tracks the table’s recent traffic and adapts capacity to it, so the question shifts from "did this request cross my configured limit?" to "is traffic growing faster than the table can adapt?" It’s less about hitting a limit you set and more about the elasticity of the underlying infrastructure. Throttling is still possible: if traffic climbs too far beyond the table’s previous peak too quickly, or concentrates on a single hot partition key, requests can be rejected, and ProvisionedThroughputExceededException can still occur. In this mode, that error signals traffic that grew too fast or is too concentrated, not a capacity setting you chose too low.
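Because throttling can surface in either mode, callers usually want a last line of defense after the SDK's built-in retries give up. The following is a sketch of exponential backoff with full jitter, not boto3's own retry logic. The helper names, the set of error codes, and the delay values are assumptions chosen for illustration; the sketch inspects the `response` attribute that botocore's ClientError carries, so it does not need to import boto3.

```python
import random
import time

# Error codes treated as throttling here; adjust to what your workload actually sees.
THROTTLE_CODES = {"ProvisionedThroughputExceededException", "ThrottlingException"}

def is_throttled(exc: Exception) -> bool:
    """True if exc looks like a botocore ClientError carrying a throttling error code."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") in THROTTLE_CODES

def call_with_backoff(fn, max_attempts=5, base_delay=0.05, max_delay=2.0, sleep=time.sleep):
    """Call fn(); on throttling, wait a random, exponentially growing delay and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not throttling, or the final failed attempt.
            if not is_throttled(exc) or attempt == max_attempts - 1:
                raise
            # Full jitter: pick a delay between 0 and the capped exponential bound.
            sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))
```

With the table from the earlier example, usage would look like `call_with_backoff(lambda: table.put_item(Item={'id': 'user-123', 'data': 'some value'}))`.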

The core problem these modes solve is managing the cost and performance trade-offs of unpredictable database traffic. Provisioned capacity requires you to forecast your workload and set specific read and write throughput limits. On-Demand capacity, conversely, automatically scales throughput based on actual traffic, abstracting away the forecasting burden.

Internally, Provisioned mode is like having a reserved lane on a highway. You pay for that lane whether you use it or not. When a request comes in, the system checks if your lane has space. If it does, the request goes through. If not, the request is throttled: the SDK retries it with backoff, and if the retries run out, your code sees an error. The key levers you control are ReadCapacityUnits and WriteCapacityUnits, set under ProvisionedThroughput when creating or updating your table.

On-Demand mode is more like a toll road where you pay per car. The road dynamically expands or contracts based on the number of cars passing through. The system is constantly monitoring traffic flow and adjusting its capacity. You don’t directly set ReadCapacityUnits or WriteCapacityUnits; instead, you enable BillingMode='PAY_PER_REQUEST'. The system handles the scaling behind the scenes.
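The difference between the two modes at table-creation time comes down to a couple of parameters. Here is a small sketch that builds the keyword arguments for boto3's `create_table` in either mode. The helper `table_definition` and the single string key named `id` are assumptions for illustration; the parameter names themselves (BillingMode, ProvisionedThroughput, ReadCapacityUnits, WriteCapacityUnits) are the ones the API uses.

```python
def table_definition(name, billing_mode="PAY_PER_REQUEST", read_units=None, write_units=None):
    """Build create_table keyword arguments for a table with a single string hash key 'id'."""
    params = {
        "TableName": name,
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
        "BillingMode": billing_mode,
    }
    if billing_mode == "PROVISIONED":
        # Provisioned mode requires explicit read and write throughput.
        if not read_units or not write_units:
            raise ValueError("PROVISIONED mode needs read_units and write_units")
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        }
    elif billing_mode != "PAY_PER_REQUEST":
        raise ValueError(f"unknown billing mode: {billing_mode}")
    return params

# Usage (requires AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client("dynamodb")
# client.create_table(**table_definition("my-example-table"))
# client.create_table(**table_definition("my-other-table", "PROVISIONED", 5, 5))
```

Keeping the parameter construction separate from the API call makes the mode choice easy to review and to test without touching AWS.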

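The critical distinction lies in how the system handles bursts. In Provisioned mode, a burst of traffic exceeding your allocated capacity (and any banked burst capacity) results in throttling. You can mitigate this with Auto Scaling, which adjusts your provisioned capacity using a target-tracking policy on utilization, that is, ConsumedReadCapacityUnits or ConsumedWriteCapacityUnits as a percentage of what you provisioned. For example, with a target utilization of 70%, Auto Scaling raises provisioned WCUs when sustained consumption pushes utilization above 70% and lowers them again when utilization drops well below the target. It reacts over minutes rather than seconds, so a very sharp spike can still be throttled before the new capacity arrives.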
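As a sketch of what such an Auto Scaling setup looks like in code, the helper below builds the two requests that Application Auto Scaling expects for a table's write capacity: one registering the scalable target with its bounds, and one attaching a target-tracking policy. The helper name, the policy name, and the 70% default are assumptions for illustration; the namespace, dimension, and predefined metric strings are the ones Application Auto Scaling uses for DynamoDB.

```python
def write_autoscaling_requests(table_name, min_wcu, max_wcu, target_utilization=70.0):
    """Build register_scalable_target and put_scaling_policy arguments for table writes."""
    if min_wcu > max_wcu:
        raise ValueError("min_wcu must not exceed max_wcu")
    common = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
    }
    # Bounds within which Auto Scaling may move the provisioned WCUs.
    scalable_target = {**common, "MinCapacity": min_wcu, "MaxCapacity": max_wcu}
    # Target tracking keeps consumed/provisioned write capacity near the target percentage.
    scaling_policy = {
        **common,
        "PolicyName": f"{table_name}-write-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_utilization,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
            },
        },
    }
    return scalable_target, scaling_policy

# Usage (requires AWS credentials, so it is left commented out):
# import boto3
# autoscaling = boto3.client("application-autoscaling")
# target, policy = write_autoscaling_requests("my-example-table", 5, 500)
# autoscaling.register_scalable_target(**target)
# autoscaling.put_scaling_policy(**policy)
```

A read-side policy follows the same shape with `dynamodb:table:ReadCapacityUnits` and `DynamoDBReadCapacityUtilization`.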

On-Demand mode, while seemingly simpler, has its own nuances. It scales automatically, but it absorbs growth instantly only up to a point. AWS documents that an On-Demand table can immediately accommodate up to double its previous peak traffic, and that throttling can occur if traffic exceeds double the previous peak within 30 minutes. So if your table previously peaked at 1,000 read request units per second, it can handle up to 2,000 per second right away; going beyond that is safest when the growth is spread over at least 30 minutes, so the table can establish a new, higher peak. A sudden, massive surge (e.g., a viral marketing campaign) that jumps far past double the previous peak can therefore see temporary throttling.
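The double-the-previous-peak rule lends itself to a quick planning calculation. The sketch below is a conservative model, not DynamoDB's internal algorithm: it assumes you actually drive traffic up to each ceiling so the peak rises, and that each doubling is spaced one 30-minute window apart. The function name and the return shape are hypothetical.

```python
def ramp_schedule(previous_peak, target, window_minutes=30):
    """Return (minutes_from_start, throughput_ceiling) steps until the ceiling covers target."""
    if previous_peak <= 0 or target <= 0:
        raise ValueError("previous_peak and target must be positive")
    minutes = 0
    ceiling = previous_peak * 2  # double the previous peak is available immediately
    schedule = [(minutes, ceiling)]
    while ceiling < target:
        # After a window at the current ceiling, treat that ceiling as the new peak.
        minutes += window_minutes
        ceiling *= 2
        schedule.append((minutes, ceiling))
    return schedule

if __name__ == "__main__":
    for minutes, ceiling in ramp_schedule(1000, 5000):
        print(f"t+{minutes:>3} min: up to {ceiling} request units/s")
```

For a previous peak of 1,000 and a target of 5,000, this model suggests ramping over roughly an hour rather than jumping straight to the target, which is the kind of pre-warming load test teams often run before a planned launch.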

The most surprising truth is that even in On-Demand mode, while you don’t set explicit limits, you still have effective limits, defined by how quickly the table can adapt and by its previous traffic peak. If you have a spiky, unpredictable workload, On-Demand is often the default choice. However, if you have a predictable workload that consistently sits near a certain throughput level, Provisioned mode with Auto Scaling can be significantly more cost-effective. In Provisioned mode you pay for reserved capacity by the hour whether or not you use it, whereas On-Demand charges a premium per request for the flexibility and automatic scaling. That premium is substantial: per request, On-Demand costs a multiple of fully utilized Provisioned capacity rather than a small percentage more, so the decision comes down to utilization. A table that uses only a small fraction of what it would need to provision is usually cheaper On-Demand; one that runs consistently busy is usually cheaper Provisioned. The exact ratio varies by Region and changes with price updates, so check the current DynamoDB pricing page before committing.
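The cost comparison can be reduced to a break-even utilization. One provisioned unit sustained for an hour can serve 3,600 single-unit requests, so Provisioned wins once average utilization exceeds the hourly unit price divided by the On-Demand cost of those 3,600 requests. The sketch below takes both prices as inputs; the function names are hypothetical, and it deliberately ignores Auto Scaling lag, reserved capacity discounts, storage, and other charges.

```python
SECONDS_PER_HOUR = 3600

def break_even_utilization(provisioned_price_per_unit_hour, on_demand_price_per_million):
    """Average utilization (0..1+) above which provisioned capacity is the cheaper mode."""
    if provisioned_price_per_unit_hour <= 0 or on_demand_price_per_million <= 0:
        raise ValueError("prices must be positive")
    on_demand_price_per_request = on_demand_price_per_million / 1_000_000
    # Cost of serving one unit's full hourly throughput at On-Demand prices.
    full_hour_on_demand = SECONDS_PER_HOUR * on_demand_price_per_request
    return provisioned_price_per_unit_hour / full_hour_on_demand

def cheaper_mode(average_utilization, provisioned_price_per_unit_hour, on_demand_price_per_million):
    """Pick the cheaper billing mode for a given average utilization of provisioned capacity."""
    threshold = break_even_utilization(
        provisioned_price_per_unit_hour, on_demand_price_per_million
    )
    return "PROVISIONED" if average_utilization > threshold else "PAY_PER_REQUEST"

if __name__ == "__main__":
    # Placeholder prices for illustration only; substitute your Region's current prices.
    print(break_even_utilization(0.36, 1000.0))
    print(cheaper_mode(0.5, 0.36, 1000.0))
```

Plugging in your Region's current per-unit-hour and per-million-request prices gives the utilization level at which switching modes starts to pay off.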

The next concept to explore is how to monitor your capacity usage effectively in both modes to make informed decisions about when to switch or adjust.