Rolling updates in CloudFormation for EC2 Auto Scaling Groups (ASGs) are a bit of a trick, and the surprising truth is that CloudFormation doesn’t actually do the rolling update itself for ASGs. It orchestrates the trigger for the ASG’s own rolling update mechanism.
Let’s see this in action. Imagine you have a CloudFormation stack defining an ASG.
Resources:
  MyEC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0abcdef1234567890
      InstanceType: t2.micro
      KeyName: my-key-pair
  MyASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: 1
      MaxSize: 3
      DesiredCapacity: 2
      VPCZoneIdentifier:
        - subnet-0123456789abcdef0
        - subnet-fedcba9876543210
      LaunchConfigurationName: !Ref MyLaunchConfig # Or LaunchTemplate
      Tags:
        - Key: Name
          Value: MyWebAppInstance
          PropagateAtLaunch: true # Required on ASG tags; copies the tag to launched instances
  MyLaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: ami-0abcdef1234567890
      InstanceType: t2.micro
      # Other properties like SecurityGroups, UserData, etc.
Now, you want to update the InstanceType for MyLaunchConfig to t3.micro.
  MyLaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: ami-0abcdef1234567890
      InstanceType: t3.micro # Changed!
      # Other properties
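When you deploy this change via CloudFormation (e.g., aws cloudformation deploy --stack-name my-stack --template-file template.yaml --capabilities CAPABILITY_IAM), CloudFormation detects that MyLaunchConfig has changed. Launch configurations are immutable, so CloudFormation creates a replacement launch configuration rather than editing the old one. Because MyASG references MyLaunchConfig through LaunchConfigurationName, the value of that reference changes, and CloudFormation updates MyASG to point at the new launch configuration.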
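Here’s where the update policy kicks in. Pointing the ASG at a new launch configuration or launch template version does not, by itself, replace running instances: existing instances keep the old configuration, and only instances launched afterwards pick up the new one. Existing instances are rolled only when the ASG resource carries an UpdatePolicy with AutoScalingRollingUpdate. The template never spells out the replacement instance by instance; the batches simply follow that pre-defined update policy.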
The mental model here is that CloudFormation is the conductor, and the ASG is the orchestra. CloudFormation signals a change in the score (the launch configuration/template), and the orchestra (ASG) plays its part according to its own established rules (the update policy).
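With an AutoScalingRollingUpdate policy in place, instances are replaced one at a time by default (MaxBatchSize defaults to 1). Roughly, each round will:
- Launch a new instance with the updated configuration.
- Wait for the new instance to become healthy (based on health checks you configure, like ELB health checks or EC2 status checks).
- Terminate an old instance.
- Repeat until all old instances are replaced.
The exact ordering depends on MinInstancesInService: if the policy lets the group dip below its current capacity (the default MinInstancesInService is 0), an old instance can be terminated before its replacement is launched.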
You can control this behavior via the UpdatePolicy attribute on the AWS::AutoScaling::AutoScalingGroup resource in your CloudFormation template.
  MyASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    UpdatePolicy:
      AutoScalingRollingUpdate:
        MinInstancesInService: 1 # Minimum number of instances that must remain in service
        MaxBatchSize: 1 # Number of instances to update at a time
        PauseTime: "PT5M" # Pause for 5 minutes between batches (ISO 8601 duration)
        # SuspendProcesses: # Optional: suspend processes like HealthCheck, ReplaceUnhealthy, etc.
        #   - HealthCheck
        #   - ReplaceUnhealthy
    Properties:
      # ... other ASG properties
MinInstancesInService ensures you don’t drop below a certain availability threshold during the update. MaxBatchSize dictates how many instances are replaced concurrently. PauseTime adds a delay after each batch, giving new instances time to start and giving you a window to catch immediate issues before the next batch begins. Suspending processes such as HealthCheck, ReplaceUnhealthy, AZRebalance, AlarmNotification, or ScheduledActions keeps other Auto Scaling activity from interfering mid-update. Avoid suspending Launch or Terminate, though: the rolling update itself relies on both.
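PauseTime on its own is just a timer. As a hedged sketch of one common refinement, the fragment below adds WaitOnResourceSignals so that CloudFormation waits, up to PauseTime, for each new instance to report success through the cfn-signal helper. The bootstrap commands in UserData are hypothetical, and the sketch assumes an AMI that ships the CloudFormation helper scripts under /opt/aws/bin (as Amazon Linux does); neither is part of the original template.

```yaml
Resources:
  MyASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    UpdatePolicy:
      AutoScalingRollingUpdate:
        MinInstancesInService: 1
        MaxBatchSize: 1
        PauseTime: PT5M             # With signals enabled, this acts as the signal timeout
        WaitOnResourceSignals: true # Wait for a cfn-signal from each new instance
    Properties:
      MinSize: 1
      MaxSize: 3
      DesiredCapacity: 2
      VPCZoneIdentifier:
        - subnet-0123456789abcdef0
        - subnet-fedcba9876543210
      LaunchConfigurationName: !Ref MyLaunchConfig
  MyLaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: ami-0abcdef1234567890
      InstanceType: t3.micro
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          # Hypothetical bootstrap step: replace with your own setup.
          yum install -y httpd && systemctl start httpd
          # Report the exit status of the previous command to CloudFormation.
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource MyASG --region ${AWS::Region}
```

If an instance does not send a success signal before the timeout expires, CloudFormation treats the update as failed and rolls the stack back, which is usually what you want when a new configuration cannot boot cleanly.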
One crucial aspect often overlooked is the interaction with Elastic Load Balancers (ELBs). For a smooth rolling update, the ASG must be attached to the load balancer (through TargetGroupARNs, or LoadBalancerNames for a Classic Load Balancer) so that it registers new instances and deregisters old ones automatically. Setting the ASG’s HealthCheckType to ELB additionally makes the ASG treat an instance as healthy only while it passes the load balancer’s health checks. When an old instance is terminated, the ASG deregisters it first, and the load balancer’s connection draining (deregistration delay) lets in-flight requests complete before the instance is fully shut down.
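To make the load balancer wiring concrete, here is a minimal sketch assuming an Application Load Balancer target group. MyTargetGroup, the VPC ID, the /health path, and the timing values are all illustrative assumptions, not taken from the original stack.

```yaml
Resources:
  MyTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      VpcId: vpc-0123456789abcdef0   # Hypothetical VPC ID
      Protocol: HTTP
      Port: 80
      HealthCheckPath: /health       # Hypothetical health endpoint
      TargetGroupAttributes:
        # How long the load balancer keeps draining in-flight requests
        # after a target is deregistered.
        - Key: deregistration_delay.timeout_seconds
          Value: "60"
  MyASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: 1
      MaxSize: 3
      DesiredCapacity: 2
      VPCZoneIdentifier:
        - subnet-0123456789abcdef0
        - subnet-fedcba9876543210
      LaunchConfigurationName: !Ref MyLaunchConfig
      TargetGroupARNs:
        - !Ref MyTargetGroup         # Ref on a target group returns its ARN
      HealthCheckType: ELB           # Use the load balancer's health checks
      HealthCheckGracePeriod: 300    # Seconds to wait before checking a new instance
```

The grace period should comfortably cover your instance boot time; if it is too short, the ASG may mark instances unhealthy and replace them before they have finished starting.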
If you’re using launch templates instead of launch configurations, the process is much the same from CloudFormation’s perspective. Updating the launch template resource creates a new template version. As long as the ASG’s LaunchTemplate property resolves to that new version (for example, by referencing the template’s latest version number), CloudFormation sees the ASG change, and the rolling update uses the new version.
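A minimal sketch of the launch template variant might look like the following. MyLaunchTemplate is a hypothetical resource name; the rest mirrors the earlier example. Referencing LatestVersionNumber through GetAtt is one way to make sure the ASG picks up each new version.

```yaml
Resources:
  MyLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        ImageId: ami-0abcdef1234567890
        InstanceType: t3.micro
  MyASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    UpdatePolicy:
      AutoScalingRollingUpdate:
        MinInstancesInService: 1
        MaxBatchSize: 1
        PauseTime: PT5M
    Properties:
      MinSize: 1
      MaxSize: 3
      DesiredCapacity: 2
      VPCZoneIdentifier:
        - subnet-0123456789abcdef0
        - subnet-fedcba9876543210
      LaunchTemplate:
        LaunchTemplateId: !Ref MyLaunchTemplate
        # Changes whenever the template is updated, so the ASG resource
        # itself changes and the update policy is applied.
        Version: !GetAtt MyLaunchTemplate.LatestVersionNumber
```

Pinning Version to a fixed number instead would leave the ASG resource unchanged when the template is edited, so no rolling update would be triggered.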
The next concept you’ll likely grapple with is managing zero-downtime deployments more robustly, especially with stateful applications or when you need finer control over traffic shifting.