DynamoDB’s auto scaling is not built into DynamoDB itself. DynamoDB never adjusts your provisioned capacity on its own; a separate AWS service, Application Auto Scaling, watches CloudWatch metrics and updates the table’s capacity on your behalf.

Let’s watch this in action. Imagine a users table with a steadily increasing number of read requests.

{
  "TableName": "users",
  "AttributeDefinitions": [
    {"AttributeName": "userId", "AttributeType": "S"}
  ],
  "KeySchema": [
    {"AttributeName": "userId", "KeyType": "HASH"}
  ],
  "ProvisionedThroughput": {
    "ReadCapacityUnits": 5,
    "WriteCapacityUnits": 5
  },
  "BillingMode": "PROVISIONED"
}

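Now, we’ll set up Application Auto Scaling for the table’s read capacity. In the API this takes two calls: RegisterScalableTarget sets the minimum and maximum capacity, and PutScalingPolicy attaches a target tracking policy that defines the target utilization. The JSON that follows combines both for readability, so treat it as a summary of the settings rather than a literal request body. Write capacity needs its own scalable target and policy.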

{
  "PolicyName": "my-users-table-read-auto-scale",
  "ServiceNamespace": "dynamodb",
  "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
  "ResourceId": "table/users",
  "ScalableTarget": {
    "MinCapacity": 2,
    "MaxCapacity": 100,
    "RoleArn": "arn:aws:iam::123456789012:role/aws-service-role/dynamodb.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_DynamoDBTable"
  },
  "TargetTrackingScalingPolicyConfiguration": {
    "TargetValue": 70.0,
    "ScaleInCooldown": 300,
    "ScaleOutCooldown": 300,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
    }
  }
}
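In code, the same settings could be applied roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names (build_scalable_target, build_scaling_policy, apply_read_scaling) are hypothetical, and the last function assumes boto3 is installed and AWS credentials are configured. The role is left out on the assumption that Application Auto Scaling falls back to its service-linked role.

```python
def build_scalable_target(table, min_capacity=2, max_capacity=100):
    """Keyword arguments for RegisterScalableTarget (table read capacity)."""
    return {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


def build_scaling_policy(table, target_value=70.0, cooldown=300):
    """Keyword arguments for PutScalingPolicy (target tracking on reads)."""
    return {
        "PolicyName": f"my-{table}-table-read-auto-scale",
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_value,
            "ScaleInCooldown": cooldown,
            "ScaleOutCooldown": cooldown,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
            },
        },
    }


def apply_read_scaling(table):
    """Send both requests; needs boto3 and credentials (assumed available)."""
    import boto3  # imported lazily so the builders work without boto3

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**build_scalable_target(table))
    client.put_scaling_policy(**build_scaling_policy(table))
```

Keeping the request builders separate from the call that sends them makes the settings easy to inspect or unit-test without touching AWS.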

The magic happens when the CloudWatch metric ConsumedReadCapacityUnits starts climbing. If utilization (consumed units / provisioned units) stays above our TargetValue (70% in this case) for a sustained period, Application Auto Scaling steps in and raises the provisioned read capacity, aiming to bring utilization back down toward 70%. For example, if your table is consistently hitting 80% read utilization, capacity might go from 5 to 10, then to 20 if traffic keeps growing. Conversely, if utilization stays below the target for a sustained period, it scales in. The cooldowns are not the trigger: ScaleOutCooldown and ScaleInCooldown (300 seconds each here) control how long Application Auto Scaling waits after one scaling activity before starting another of the same kind.
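The arithmetic behind a target-tracking decision can be sketched in a few lines. This is a simplified model under stated assumptions, not the service’s actual algorithm: it computes the smallest capacity that would put utilization at or below the target, clamped to the configured minimum and maximum. The function names are hypothetical.

```python
import math


def utilization(consumed, provisioned):
    """Utilization as a percentage of provisioned capacity."""
    return 100.0 * consumed / provisioned


def capacity_for_target(consumed, target_pct=70.0, min_capacity=2, max_capacity=100):
    """Smallest capacity that keeps utilization at or below target_pct,
    clamped to [min_capacity, max_capacity]. Simplified model only."""
    needed = math.ceil(consumed * 100.0 / target_pct)
    return max(min_capacity, min(max_capacity, needed))


# 4 consumed units against 5 provisioned is 80% utilization.
print(utilization(4, 5))        # 80.0
# Bringing that to 70% or below needs at least 6 units in this model.
print(capacity_for_target(4))   # 6
```

The real service may move in larger steps than this model suggests, but the clamping to MinCapacity and MaxCapacity works the same way: no scaling decision goes outside that range.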

The core problem this solves is avoiding throttling (ProvisionedThroughputExceededException, returned as an HTTP 400 error) during traffic spikes and preventing overspending on provisioned capacity during lulls. The internal mechanism relies on CloudWatch alarms. Application Auto Scaling creates these alarms behind the scenes on the table’s consumed and provisioned capacity metrics; DynamoDBReadCapacityUtilization and DynamoDBWriteCapacityUtilization are the predefined metric types that tell it whether to track reads or writes. When an alarm threshold is breached, it triggers an action: a scaling operation.
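One way to picture the alarm logic is as a check that the most recent N datapoints all breach a threshold. The sketch below is only an illustration of that idea: the function name is hypothetical, and the period counts used in the example (2 for scale-out, 15 for scale-in) are assumptions for demonstration rather than values stated in this article.

```python
def alarm_breached(datapoints, threshold, periods, above=True):
    """Return True when the last `periods` datapoints all breach `threshold`.

    datapoints: per-minute utilization percentages, oldest first.
    above:      True for a scale-out style alarm, False for scale-in.
    """
    if periods < 1:
        raise ValueError("periods must be at least 1")
    if len(datapoints) < periods:
        return False  # not enough data to evaluate yet
    recent = datapoints[-periods:]
    if above:
        return all(value > threshold for value in recent)
    return all(value < threshold for value in recent)


# Two consecutive minutes above 70% would trip a scale-out style alarm
# (the period count of 2 is an assumption for this example).
print(alarm_breached([60, 75, 82], threshold=70, periods=2))   # True
# A single dip resets the streak.
print(alarm_breached([82, 60, 75], threshold=70, periods=2))   # False
```

Requiring several consecutive breaches is what keeps a one-minute blip from triggering a scaling operation.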

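The ScalableDimension is crucial here. For DynamoDB tables, it’s dynamodb:table:ReadCapacityUnits or dynamodb:table:WriteCapacityUnits. For global secondary indexes (GSIs), it’s dynamodb:index:ReadCapacityUnits or dynamodb:index:WriteCapacityUnits, and the resource ID takes the form table/<table-name>/index/<index-name>. You need to specify the correct dimension for the resource you want to scale, and each dimension needs its own scalable target and policy.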
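A small helper can keep the resource ID and dimension consistent with each other. This is a sketch; the function name and the index name email-index are hypothetical.

```python
def scalable_resource(table, index=None, kind="Read"):
    """Return (ResourceId, ScalableDimension) for a table or one of its GSIs.

    kind is "Read" or "Write".
    """
    if kind not in ("Read", "Write"):
        raise ValueError("kind must be 'Read' or 'Write'")
    if index is None:
        return f"table/{table}", f"dynamodb:table:{kind}CapacityUnits"
    return (
        f"table/{table}/index/{index}",
        f"dynamodb:index:{kind}CapacityUnits",
    )


print(scalable_resource("users"))
# ('table/users', 'dynamodb:table:ReadCapacityUnits')
print(scalable_resource("users", index="email-index", kind="Write"))
# ('table/users/index/email-index', 'dynamodb:index:WriteCapacityUnits')
```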

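The RoleArn points to the IAM role that Application Auto Scaling assumes to perform actions on your behalf. By default this is a service-linked role that AWS creates for you. The role needs dynamodb:DescribeTable and dynamodb:UpdateTable permissions, plus CloudWatch permissions to create, describe, and delete the alarms it manages.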

Auto scaling for DynamoDB is not about predicting future traffic. It’s a reactive system that responds to current utilization, so there’s always a lag, typically on the order of minutes, between a change in traffic and the capacity change that follows. If you have sudden, massive, short-lived spikes, you might still experience some throttling before auto scaling can catch up. For such workloads, on-demand capacity mode is usually a better fit.
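A toy simulation makes the cost of that lag concrete. It is a rough sketch under loud assumptions: capacity jumps to a fixed scaled_capacity exactly reaction_minutes after demand first exceeds capacity, and DynamoDB’s burst capacity is ignored. All names and numbers are hypothetical.

```python
def throttled_units(demand_per_min, provisioned, reaction_minutes, scaled_capacity):
    """Sum the demand that exceeds capacity, minute by minute.

    demand_per_min: average requested capacity units for each minute.
    Capacity stays at `provisioned` until `reaction_minutes` after the first
    minute that exceeded it, then becomes `scaled_capacity` (assumed lag model).
    """
    throttled = 0
    capacity = provisioned
    first_over = None  # minute index of the first overload, if any
    for minute, demand in enumerate(demand_per_min):
        if first_over is not None and minute - first_over >= reaction_minutes:
            capacity = scaled_capacity
        if demand > capacity:
            throttled += demand - capacity
            if first_over is None:
                first_over = minute
    return throttled


# Sustained spike: three minutes of excess demand before capacity catches up.
print(throttled_units([3, 3, 50, 50, 50, 50], 5, 3, 100))   # 135
# Short spike: it is over before scaling reacts, so every excess unit is throttled.
print(throttled_units([3, 50, 50, 3, 3, 3], 5, 3, 100))     # 90
```

In the second case the extra capacity arrives only after the spike has passed, which is the situation where on-demand mode tends to be the safer choice.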

A common misconception is that Application Auto Scaling also manages DynamoDB on-demand backups. It doesn’t. On-demand backups don’t consume the table’s provisioned throughput, so there is no backup capacity for auto scaling to adjust, and they don’t affect your table’s regular performance.

The next step in managing DynamoDB performance is understanding how provisioned throughput interacts with item size and partition key design.
