An Auto Scaling Group is less about "scaling" and more about "resilience against failure."
Let’s get one running. We’ll need a couple of things first: an EC2 Launch Template and a VPC with subnets in at least two Availability Zones.
1. Create an EC2 Launch Template
This defines what your instances will look like when they spin up.
aws ec2 create-launch-template --launch-template-name my-app-launch-template --version-description "Initial version for my app" --launch-template-data '{"ImageId":"ami-0abcdef1234567890","InstanceType":"t3.micro","KeyName":"my-ssh-key","SecurityGroupIds":["sg-0123456789abcdef0"]}'
--launch-template-name: A human-readable name for your template.
--version-description: A helpful note about this version.
--launch-template-data: The instance settings, passed as one JSON document rather than as individual flags. The keys used here are described below.
InstanceType: The size of your instances. t3.micro is good for testing.
ImageId: The Amazon Machine Image (AMI) to use. Replace ami-0abcdef1234567890 with a valid AMI ID for your region (e.g., an Amazon Linux 2 AMI). You can find these in the EC2 console.
KeyName: The SSH key pair you’ll use to connect to your instances.
SecurityGroupIds: The security group(s) that will be applied to your instances. Make sure this allows SSH (port 22) if you want to connect.
The template sets no subnet. The Auto Scaling Group picks a subnet for each instance from the list you give it in the next step, which is how it spreads instances across Availability Zones.
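If you would rather not look up the AMI ID in the console, AWS publishes the latest Amazon Linux AMI IDs as public SSM parameters. The sketch below wraps that lookup in a small helper. The function name latest_al2_ami is made up for this example, and it assumes the AWS CLI is installed and your credentials allow ssm:GetParameter.

```shell
# Hypothetical helper: print the latest Amazon Linux 2 AMI ID for a region,
# read from the public SSM parameter that AWS maintains.
latest_al2_ami() {
  # $1: region name, e.g. us-east-1 (placeholder; use your own region)
  aws ssm get-parameter \
    --region "$1" \
    --name /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2 \
    --query 'Parameter.Value' \
    --output text
}

# Example (makes a real API call):
# latest_al2_ami us-east-1
```

The printed ID can be pasted in place of ami-0abcdef1234567890 in the launch template data.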
2. Create an Auto Scaling Group
Now, we’ll tie the Launch Template to the Auto Scaling Group.
aws autoscaling create-auto-scaling-group --auto-scaling-group-name my-app-asg --launch-template LaunchTemplateName=my-app-launch-template,Version=1 --min-size 1 --max-size 3 --desired-capacity 2 --vpc-zone-identifier "subnet-0123456789abcdef0,subnet-fedcba9876543210"
--auto-scaling-group-name: A name for your ASG.
--launch-template: Specifies the Launch Template to use. Version=1 refers to the initial version we created.
--min-size: The minimum number of instances the group may run. The ASG never scales in below this number, and the desired capacity cannot be set lower than it.
--max-size: The absolute maximum number of instances the ASG can launch.
--desired-capacity: The number of instances the ASG aims to have running at any given time. It must sit between the minimum and the maximum. If an instance dies, the ASG launches a new one to get back to this number, and if you change the value, the ASG launches or terminates instances to match.
--vpc-zone-identifier: This is crucial for high availability. Provide a comma-separated list of subnet IDs from different Availability Zones. The ASG will distribute instances across these subnets. Replace subnet-0123456789abcdef0 and subnet-fedcba9876543210 with actual subnet IDs from your VPC.
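The three size flags have to satisfy min <= desired <= max, otherwise the API rejects the request. A minimal sketch of a local sanity check follows; check_capacity is a hypothetical helper for this example, not part of the AWS CLI.

```shell
# Hypothetical helper: succeed only if min <= desired <= max.
check_capacity() {
  # $1: min size, $2: desired capacity, $3: max size
  if [ "$1" -le "$2" ] && [ "$2" -le "$3" ]; then
    return 0
  fi
  echo "invalid capacity: need min ($1) <= desired ($2) <= max ($3)" >&2
  return 1
}

# Example: validate the values used above before creating the group.
# check_capacity 1 2 3 && echo "sizes look consistent"
```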
Seeing it in Action
Once you run that create-auto-scaling-group command, AWS will start provisioning instances. Go to the EC2 console, and you’ll see your instances launching. Each one is identified by its instance ID (something like i-0123456789abcdef0) and carries an aws:autoscaling:groupName tag that the ASG adds automatically.
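The same view is available from the command line by filtering on the tag the ASG applies. This is a sketch under the assumption that the group is named my-app-asg as above; list_asg_instances is a name invented for this example.

```shell
# Hypothetical helper: list pending/running instances that belong to an ASG,
# found via the aws:autoscaling:groupName tag.
list_asg_instances() {
  # $1: Auto Scaling Group name
  aws ec2 describe-instances \
    --filters "Name=tag:aws:autoscaling:groupName,Values=$1" \
              "Name=instance-state-name,Values=pending,running" \
    --query 'Reservations[].Instances[].[InstanceId,Placement.AvailabilityZone,State.Name]' \
    --output table
}

# Example (makes a real API call):
# list_asg_instances my-app-asg
```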
If you manually terminate one of these instances from the EC2 console, you’ll see the ASG automatically launch a replacement within a minute or two to bring the count back up to your desired-capacity. This is the resilience kicking in.
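To try this from a terminal instead of the console, something like the following should work. It is a sketch, not a hardened script: terminate_and_watch is a hypothetical name, the instance ID is a placeholder, and the command really does terminate the instance you pass in.

```shell
# Hypothetical helper: terminate one instance, then show the group's
# five most recent scaling activities.
terminate_and_watch() {
  # $1: Auto Scaling Group name, $2: ID of the instance to terminate
  aws ec2 terminate-instances --instance-ids "$2" || return 1
  aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name "$1" \
    --query 'Activities[:5].[StartTime,StatusCode,Description]' \
    --output table
}

# Example (destructive, makes real API calls):
# terminate_and_watch my-app-asg i-0123456789abcdef0
```

The replacement launch may not appear in the activity list immediately. Re-running the describe-scaling-activities part after a minute or so should show it.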
The ASG is also smart about Availability Zones. If an AZ becomes unhealthy, the ASG will try to launch new instances in healthy AZs to maintain the desired capacity.
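One way to see the spread is to count the group's instances per Availability Zone. The helper below is a sketch; instances_per_az is an invented name, and it relies on the CLI's text output being tab-separated.

```shell
# Hypothetical helper: print "<availability-zone> <instance count>" per AZ.
instances_per_az() {
  # $1: Auto Scaling Group name
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].Instances[].AvailabilityZone' \
    --output text |
    tr '\t' '\n' | sort | uniq -c | awk '{print $2, $1}'
}

# Example (makes a real API call):
# instances_per_az my-app-asg
```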
The Launch Template defines the what (AMI, instance type, key pair, security groups), and the Auto Scaling Group defines the how many and where. It’s this combination that provides automatic recovery from instance failures and allows you to easily scale up or down based on demand.
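Changing the "how many" by hand is a single call. A minimal sketch, assuming the my-app-asg group from above; scale_to is a hypothetical wrapper name.

```shell
# Hypothetical helper: set a new desired capacity on an ASG.
scale_to() {
  # $1: Auto Scaling Group name
  # $2: new desired capacity (must stay between the group's min and max size)
  aws autoscaling set-desired-capacity \
    --auto-scaling-group-name "$1" \
    --desired-capacity "$2"
}

# Example (makes a real API call): grow the group to three instances.
# scale_to my-app-asg 3
```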
You’ll often want to attach a Load Balancer to an ASG, usually by attaching the load balancer’s target group. When you do, the ASG will automatically register new instances with the load balancer and deregister instances that are being terminated. The load balancer then routes traffic only to registered instances that pass its health checks.
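Assuming you already have an Application or Network Load Balancer with a target group, attaching it might look like the sketch below. The function name and the 300-second grace period are choices made for this example, and the target group ARN is something you supply.

```shell
# Hypothetical helper: attach a target group to an ASG and let the ASG
# use the load balancer's health checks when deciding to replace instances.
attach_target_group() {
  # $1: Auto Scaling Group name, $2: ARN of an existing target group
  aws autoscaling attach-load-balancer-target-groups \
    --auto-scaling-group-name "$1" \
    --target-group-arns "$2" || return 1
  # Give new instances 300 seconds to boot before health checks count.
  aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$1" \
    --health-check-type ELB \
    --health-check-grace-period 300
}

# Example (makes real API calls; pass your own target group ARN):
# attach_target_group my-app-asg "$TARGET_GROUP_ARN"
```

Switching the health check type to ELB is optional. Without it, the ASG only replaces instances that fail EC2 status checks.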
The real power of an Auto Scaling Group isn’t just replacing failed instances; it’s also about making your application robust. By spreading instances across multiple Availability Zones, you eliminate single points of failure at the datacenter level. When you combine this with a load balancer, you get an application that can withstand hardware failures and even the loss of an Availability Zone with little or no user impact, provided the remaining instances can carry the load.
By default, the ASG doesn’t know whether your application is healthy, only whether the EC2 instance passes its status checks. It also won’t change capacity in response to load on its own. To scale based on actual application load (like CPU utilization or request count), you’ll need to configure Scaling Policies, which are the next logical step.
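As a preview of that next step, a target tracking policy on average CPU could be sketched like this. The helper name, the policy name, and the 50 percent target are assumptions for illustration, not recommendations.

```shell
# Hypothetical helper: add a target tracking policy that keeps the group's
# average CPU utilization near a target percentage.
add_cpu_target_tracking() {
  # $1: Auto Scaling Group name, $2: target average CPU percentage, e.g. 50
  aws autoscaling put-scaling-policy \
    --auto-scaling-group-name "$1" \
    --policy-name "cpu-target-$2" \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration \
      "{\"PredefinedMetricSpecification\":{\"PredefinedMetricType\":\"ASGAverageCPUUtilization\"},\"TargetValue\":$2}"
}

# Example (makes a real API call):
# add_cpu_target_tracking my-app-asg 50
```

With a policy like this in place, the ASG adjusts desired capacity itself, always staying between the min and max sizes set on the group.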