EC2 Step Scaling policies allow you to fine-tune your Auto Scaling group’s response to changing load by applying different scaling adjustments based on the magnitude of the CloudWatch alarm breach.

Let’s see this in action. Imagine we have a web application running on EC2 instances behind an Application Load Balancer. We want to scale out when CPU utilization exceeds 60%, but we want to be more aggressive if it spikes above 80%.

Here’s a sample CloudWatch alarm configuration that triggers a step scaling policy:

{
  "AlarmName": "HighCPUAlarm",
  "AlarmDescription": "Scale out when CPU utilization is high",
  "ActionsEnabled": true,
  "OKActions": [],
  "AlarmActions": [
    "arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:a1b2c3d4-e5f6-7890-1234-567890abcdef:autoScalingGroupName/MyWebAppASG:policyName/MyStepScalingPolicy"
  ],
  "InsufficientDataActions": [],
  "MetricName": "CPUUtilization",
  "Namespace": "AWS/EC2",
  "Statistic": "Average",
  "Period": 300,
  "EvaluationPeriods": 2,
  "DatapointsToAlarm": 2,
  "Threshold": 60.0,
  "ComparisonOperator": "GreaterThanThreshold",
  "TreatMissingData": "notBreaching",
  "Dimensions": [
    {
      "Name": "AutoScalingGroupName",
      "Value": "MyWebAppASG"
    }
  ],
  "Unit": "Percent"
}

This alarm, HighCPUAlarm, will fire if the average CPU utilization across the MyWebAppASG group is above 60% for two consecutive 5-minute periods (10 minutes total). When it fires, it will trigger the MyStepScalingPolicy.
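
The evaluation rule can be sketched in a few lines of Python. This is a simplified model, not CloudWatch’s implementation: the function name alarm_state and its parameters are hypothetical, and missing-data handling is reduced to "not enough data means no alarm."

```python
def alarm_state(datapoints, threshold=60.0, evaluation_periods=2, datapoints_to_alarm=2):
    """Return "ALARM" or "OK" for a list of per-period averages, oldest first.

    Simplified model of an "M out of N" alarm with GreaterThanThreshold.
    """
    window = datapoints[-evaluation_periods:]
    if len(window) < evaluation_periods:
        # Assumption: too few datapoints is treated as not breaching.
        return "OK"
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Two consecutive 5-minute averages above 60% put the alarm into ALARM.
print(alarm_state([55.0, 65.0, 72.0]))
# One period above and one below does not.
print(alarm_state([65.0, 58.0]))
```

Note that a datapoint of exactly 60.0 does not count as breaching, because the comparison operator is GreaterThanThreshold rather than GreaterThanOrEqualToThreshold.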

Now, let’s define the MyStepScalingPolicy itself. This is where the "step" comes in. We’ll use the AWS CLI for this, as it’s a common way to manage these resources.

aws autoscaling put-scaling-policy \
    --auto-scaling-group-name MyWebAppASG \
    --policy-name MyStepScalingPolicy \
    --policy-type StepScaling \
    --adjustment-type ChangeInCapacity \
    --metric-aggregation-type Average \
    --estimated-instance-warmup 300 \
    --step-adjustments '[
    {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 10.0, "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 10.0, "MetricIntervalUpperBound": 20.0, "ScalingAdjustment": 2},
    {"MetricIntervalLowerBound": 20.0, "ScalingAdjustment": 3}
]'

This command creates a step scaling policy named MyStepScalingPolicy for the MyWebAppASG Auto Scaling group. The policy does not define an alarm itself. The command returns the policy’s ARN, and the link is made from the CloudWatch side: HighCPUAlarm lists that ARN in its AlarmActions, as in the alarm configuration shown earlier.

  • --policy-type StepScaling: This explicitly tells AWS we’re configuring a step scaling policy.
  • --adjustment-type ChangeInCapacity: This means each step will add or remove a specific number of instances. We could also use PercentChangeInCapacity to scale by a percentage, or ExactCapacity to set the group to a fixed size.
  • --metric-aggregation-type Average: This controls how the metric datapoints are aggregated when the policy picks a step. Valid values are Minimum, Maximum, and Average.
  • --estimated-instance-warmup 300: Step scaling policies do not use a cooldown (the --cooldown option applies to simple scaling policies). Instead, this is the number of seconds (here, 5 minutes) before a newly launched instance contributes to the group’s aggregated metrics. Instances that are still warming up are not counted as part of the current capacity when scaling out, which keeps the policy from adding more instances than needed.
  • --step-adjustments: This is the core of step scaling. It’s an array of objects, each defining one step as a range of breach sizes and the adjustment to apply in that range.

Let’s break down the step adjustments array. Each step is a single object with a lower bound, an upper bound, and an adjustment. The bounds are offsets from the alarm threshold (60%), not absolute CPU values:

  • {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 10.0, "ScalingAdjustment": 1}: This is the first step. When CPU utilization is between 60% and 70% (inclusive of 60, exclusive of 70), add 1 instance.
  • {"MetricIntervalLowerBound": 10.0, "MetricIntervalUpperBound": 20.0, "ScalingAdjustment": 2}: When CPU utilization is between 70% and 80% (inclusive of 70, exclusive of 80), add 2 instances.
  • {"MetricIntervalLowerBound": 20.0, "ScalingAdjustment": 3}: When CPU utilization is 80% or higher, add 3 instances. Omitting MetricIntervalUpperBound means the step extends to positive infinity.

For a scale-out policy, the system interprets each range as [MetricIntervalLowerBound, MetricIntervalUpperBound) relative to the threshold. So, [0.0, 10.0) means a breach of 0 up to, but not including, 10. The ranges can’t overlap or leave gaps, so the MetricIntervalUpperBound of one step must equal the MetricIntervalLowerBound of the next. At most one step can omit its lower bound (meaning negative infinity), and at most one step can omit its upper bound (meaning positive infinity).
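
The way a step is selected can be sketched in Python. This is an illustrative model rather than the Auto Scaling implementation: the names STEPS and select_adjustment are hypothetical, and the example assumes three steps of +1, +2, and +3 instances at breach offsets of 0, 10, and 20 above a 60% alarm threshold.

```python
import math

# Hypothetical step table: (lower_bound, upper_bound, scaling_adjustment).
# Bounds are offsets from the alarm threshold; None means unbounded.
STEPS = [
    (0.0, 10.0, 1),
    (10.0, 20.0, 2),
    (20.0, None, 3),
]


def select_adjustment(metric_value, threshold=60.0, steps=STEPS):
    """Return the adjustment whose range contains the breach, or 0 if none does."""
    breach = metric_value - threshold
    for lower, upper, adjustment in steps:
        lo = -math.inf if lower is None else lower
        hi = math.inf if upper is None else upper
        # Scale-out ranges include the lower bound and exclude the upper bound.
        if lo <= breach < hi:
            return adjustment
    return 0


for cpu in (65.0, 75.0, 85.0):
    print(cpu, select_adjustment(cpu))
```

A CPU value of exactly 70% has a breach of 10, so it falls into the second step rather than the first, because the upper bound of each range is exclusive.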

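With the alarm threshold at 60%, this configuration means:

  • If average CPU is 65% (a breach of 5, which is less than 10), add 1 instance.
  • If average CPU is 75% (a breach of 15, which is >= 10 and < 20), add 2 instances.
  • If average CPU is 85% (a breach of 25, which is >= 20), add 3 instances.

The key here is that with ChangeInCapacity, the ScalingAdjustment value is added to the current capacity. If the Auto Scaling group has 2 instances and CPU hits 85%, it will add 3 instances, bringing the total to 5. If the alarm is still breaching at 75% after those instances have finished warming up, the policy will add 2 more, bringing it to 7. While instances are still warming up, they are not counted as part of the current capacity, so repeated breaches that fall in the same step do not stack. The group’s maximum size always caps the result.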
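
The capacity arithmetic can be sketched as follows. This is a minimal model under stated assumptions: the function name apply_change_in_capacity and the min_size and max_size values are hypothetical, and instance warm-up is not modeled.

```python
def apply_change_in_capacity(current, adjustment, min_size=1, max_size=10):
    """Add a ChangeInCapacity adjustment, then clamp to the group's size limits."""
    return max(min_size, min(max_size, current + adjustment))


capacity = 2
capacity = apply_change_in_capacity(capacity, 3)  # 2 + 3
print(capacity)
capacity = apply_change_in_capacity(capacity, 2)  # 5 + 2
print(capacity)
```

With a hypothetical maximum size of 10, a further adjustment of 3 from a capacity of 9 would stop at 10 instead of reaching 12.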

The most surprising thing about step scaling is that MetricIntervalLowerBound and MetricIntervalUpperBound are not absolute metric values. Each bound is an offset from the threshold of the alarm that triggered the policy. If the alarm metric is CPUUtilization and the threshold is 60%, a MetricIntervalLowerBound of 20.0 means 80% CPU, not 20%. So, if CPUUtilization is 70%, the breach is 10 (70 minus 60), which selects the step that starts at a MetricIntervalLowerBound of 10.0, and it would trigger an adjustment of 2 instances.

If you also want to scale in, you’d configure a separate scaling policy for a LowCPUAlarm (e.g., when CPU drops below 30%) with different step adjustments, perhaps removing instances more aggressively at lower utilization levels.
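
A scale-in policy uses the same mechanism with negative offsets and negative adjustments. The sketch below is illustrative only: the step values, the 30% threshold, and the names SCALE_IN_STEPS and select_scale_in_adjustment are assumptions chosen for this example. It also reflects one documented difference from scale-out: when the metric is below the threshold, the lower bound of a step is exclusive and the upper bound is inclusive.

```python
import math

# Hypothetical scale-in steps: (lower_bound, upper_bound, scaling_adjustment).
# Bounds are offsets from a 30% alarm threshold; None means unbounded.
SCALE_IN_STEPS = [
    (-10.0, 0.0, -1),   # CPU above 20% and up to 30%: remove 1 instance
    (None, -10.0, -2),  # CPU at 20% or lower: remove 2 instances
]


def select_scale_in_adjustment(metric_value, threshold=30.0, steps=SCALE_IN_STEPS):
    """Return the (negative) adjustment for a metric below the threshold, else 0."""
    offset = metric_value - threshold
    for lower, upper, adjustment in steps:
        lo = -math.inf if lower is None else lower
        hi = math.inf if upper is None else upper
        # Below the threshold, ranges exclude the lower bound and include the upper.
        if lo < offset <= hi:
            return adjustment
    return 0


for cpu in (25.0, 20.0, 15.0):
    print(cpu, select_scale_in_adjustment(cpu))
```

Here a CPU value of exactly 20% has an offset of -10, which lands in the more aggressive step because the upper bound of that range is inclusive.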

The next concept you’ll likely encounter is Target Tracking Scaling policies, which aim to maintain a specific metric value rather than reacting to breaches.
