Blue-green deployments are a way to minimize downtime and risk when releasing new versions of your application.
Imagine you have a live, working version of your application, let’s call it "Blue." When you’re ready to deploy a new version, you spin up a second, identical environment, but with the new code. This new environment is "Green."
Here’s a quick sketch of how that might be set up in ECS, using CloudFormation to manage the infrastructure.

    AWSTemplateFormatVersion: '2010-09-09'
    Description: Example Blue-Green ECS Deployment

    Parameters:
      LatestImageURI:
        Type: String
        Description: The ECS Task Definition image URI for the new version (e.g., 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest)

    Resources:
      # ... (VPC, Subnets, Security Groups, IAM Roles) ...

      # Define the ALB and Target Groups
      ApplicationLoadBalancer:
        Type: AWS::ElasticLoadBalancingV2::LoadBalancer
        Properties:
          Name: my-app-alb
          Scheme: internet-facing
          Type: application
          IpAddressType: ipv4
          Subnets:
            - subnet-xxxxxxxxxxxxxxxxx
            - subnet-yyyyyyyyyyyyyyyyy
          SecurityGroups:
            - sg-zzzzzzzzzzzzzzzzzzz

      Listener:
        Type: AWS::ElasticLoadBalancingV2::Listener
        Properties:
          DefaultActions:
            - Type: forward
              TargetGroupArn: !Ref TargetGroupBlue
          LoadBalancerArn: !Ref ApplicationLoadBalancer
          Port: 80
          Protocol: HTTP

      TargetGroupBlue:
        Type: AWS::ElasticLoadBalancingV2::TargetGroup
        Properties:
          Name: my-app-tg-blue
          Port: 80
          Protocol: HTTP
          VpcId: vpc-aaaaaaaaaaaaaaaaa
          HealthCheckPath: /health
          TargetType: ip

      TargetGroupGreen:
        Type: AWS::ElasticLoadBalancingV2::TargetGroup
        Properties:
          Name: my-app-tg-green
          Port: 80
          Protocol: HTTP
          VpcId: vpc-aaaaaaaaaaaaaaaaa
          HealthCheckPath: /health
          TargetType: ip

      # Define the ECS Cluster and Services
      EcsCluster:
        Type: AWS::ECS::Cluster
        Properties:
          ClusterName: my-app-cluster

      TaskDefinitionBlue:
        Type: AWS::ECS::TaskDefinition
        Properties:
          Family: my-app-task
          NetworkMode: awsvpc
          RequiresCompatibilities:
            - FARGATE
          Cpu: '256'
          Memory: '512'
          # Assume this role is defined elsewhere; GetAtt returns its ARN
          ExecutionRoleArn: !GetAtt EcsTaskExecutionRole.Arn
          ContainerDefinitions:
            - Name: my-app-container
              Image: !Ref LatestImageURI # This will be updated for Green
              PortMappings:
                - ContainerPort: 80
                  Protocol: tcp
              LogConfiguration:
                LogDriver: awslogs
                Options:
                  awslogs-group: /ecs/my-app
                  awslogs-region: !Ref AWS::Region # required by the awslogs driver
                  awslogs-stream-prefix: ecs

      ServiceBlue:
        Type: AWS::ECS::Service
        # The target group must be attached to a listener before the service is created
        DependsOn: Listener
        Properties:
          Cluster: !Ref EcsCluster
          ServiceName: my-app-service-blue
          TaskDefinition: !Ref TaskDefinitionBlue
          DesiredCount: 2
          LaunchType: FARGATE
          NetworkConfiguration:
            AwsvpcConfiguration:
              Subnets:
                - subnet-xxxxxxxxxxxxxxxxx
                - subnet-yyyyyyyyyyyyyyyyy
              SecurityGroups:
                - sg-zzzzzzzzzzzzzzzzzzz
          LoadBalancers:
            - TargetGroupArn: !Ref TargetGroupBlue
              ContainerName: my-app-container
              ContainerPort: 80

      # The Green Task Definition and Service will be very similar,
      # but will use a *different* Task Definition ARN and Target Group.
      # This is typically managed by a separate CloudFormation stack or CodePipeline.

    Outputs:
      LoadBalancerDNS:
        Description: The DNS name of the Application Load Balancer
        Value: !GetAtt ApplicationLoadBalancer.DNSName

The core idea is to have two identical sets of ECS tasks and two corresponding target groups for your Application Load Balancer (ALB). One target group (TargetGroupBlue) points to your current production tasks (the "Blue" environment), and the other (TargetGroupGreen) points to the new tasks you’re deploying (the "Green" environment).
When you’re ready to deploy, you first launch the new tasks using the new container image. These new tasks register with TargetGroupGreen. Your ALB is configured to send traffic only to TargetGroupBlue at this stage.
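The Green side of the template can be sketched as follows. This is a minimal, hedged example rather than a definitive setup: TaskDefinitionGreen, ServiceGreen, and TestListener are hypothetical names that mirror the Blue resources, and the fragment is assumed to sit under the same Resources section. The test listener on port 8080 is an assumption added here, because ECS only accepts a target group in a service once that target group is attached to a load balancer listener; it also gives you a separate port for smoke-testing Green before the cutover (the ALB security group would need to allow it). If Blue and Green both read LatestImageURI they run the same image, so Blue is assumed to be pinned to the current image elsewhere.

```yaml
  # Fragment: add under the existing Resources section (all names are illustrative)
  TestListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref TargetGroupGreen
      LoadBalancerArn: !Ref ApplicationLoadBalancer
      Port: 8080 # assumed test port, not part of the original template
      Protocol: HTTP

  TaskDefinitionGreen:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: my-app-task
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      Cpu: '256'
      Memory: '512'
      ExecutionRoleArn: !GetAtt EcsTaskExecutionRole.Arn # role assumed to be defined elsewhere
      ContainerDefinitions:
        - Name: my-app-container
          Image: !Ref LatestImageURI # the new version
          PortMappings:
            - ContainerPort: 80
              Protocol: tcp

  ServiceGreen:
    Type: AWS::ECS::Service
    DependsOn: TestListener # Green target group must be attached to a listener first
    Properties:
      Cluster: !Ref EcsCluster
      ServiceName: my-app-service-green
      TaskDefinition: !Ref TaskDefinitionGreen
      DesiredCount: 2
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          Subnets:
            - subnet-xxxxxxxxxxxxxxxxx
            - subnet-yyyyyyyyyyyyyyyyy
          SecurityGroups:
            - sg-zzzzzzzzzzzzzzzzzzz
      LoadBalancers:
        - TargetGroupArn: !Ref TargetGroupGreen
          ContainerName: my-app-container
          ContainerPort: 80
```

With this in place, production traffic on port 80 still reaches only TargetGroupBlue, while Green can be exercised on the test port.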
Once the "Green" tasks are healthy and ready (you can check this via health checks configured on the TargetGroupGreen and by looking at ECS service logs), you initiate the switch. This is the critical step. Instead of updating the running ECS service directly, you update the ALB listener’s default action. You tell the ALB to now forward traffic to TargetGroupGreen instead of TargetGroupBlue.
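
    # Example of updating the existing Listener resource to point to Green.
    # Keep the same logical ID: declaring a second listener on port 80 of the
    # same load balancer would conflict with the original one.
    Listener:
      Type: AWS::ElasticLoadBalancingV2::Listener
      Properties:
        DefaultActions:
          - Type: forward
            TargetGroupArn: !Ref TargetGroupGreen # Switched to Green
        LoadBalancerArn: !Ref ApplicationLoadBalancer
        Port: 80
        Protocol: HTTP
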
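From the ALB’s perspective, the switch is a single configuration change. Once the updated listener takes effect, which usually happens within seconds, new requests go to TargetGroupGreen, while requests already in flight to Blue should normally be allowed to finish. Because the "Green" tasks are already running, warmed up, and passing health checks, users should see no downtime; they are simply routed to the new version.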
After the switch, you monitor your application. If everything looks good, you can eventually scale down the old "Blue" tasks. The roles then swap: the environment now serving traffic becomes the baseline for the next release, and the idle target group receives the next new version. If something goes wrong, you can quickly revert by changing the ALB listener’s default action back to TargetGroupBlue, which works only as long as the Blue tasks are still running.
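One way to make both the cutover and the rollback a one-parameter change is to drive the listener from a stack parameter. The sketch below is an assumption layered on the template above, not part of it: ActiveColor and BlueIsLive are hypothetical names, and only the affected sections are shown.

```yaml
Parameters:
  ActiveColor: # hypothetical parameter selecting the live environment
    Type: String
    AllowedValues:
      - blue
      - green
    Default: blue

Conditions:
  BlueIsLive: !Equals [!Ref ActiveColor, blue]

Resources:
  Listener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      DefaultActions:
        - Type: forward
          # Forward to Blue or Green depending on the parameter value
          TargetGroupArn: !If [BlueIsLive, !Ref TargetGroupBlue, !Ref TargetGroupGreen]
      LoadBalancerArn: !Ref ApplicationLoadBalancer
      Port: 80
      Protocol: HTTP
```

Updating the stack with ActiveColor set to green performs the cutover, and setting it back to blue performs the rollback, without touching either ECS service.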
The most surprising thing about this pattern is how simple the actual "cutover" can be, provided you have the right infrastructure in place. It’s not about stopping and starting services; it’s about reconfiguring a load balancer.
The key levers you control are:
- Container Image: This is what changes between Blue and Green. You update the Image property in your ECS Task Definition.
- Target Groups: You need two distinct target groups, one for Blue and one for Green.
- ALB Listener: This is where the magic happens. You update the DefaultActions to point to either the Blue or Green target group.
- ECS Service: You’ll likely have two ECS services, one pointing to the Blue target group and one to the Green, or you might update a single service’s target group association. For full isolation, two services are common.
- Health Checks: Crucial for knowing when Green is ready to receive traffic.
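Since health checks decide when Green counts as ready, it can help to set them explicitly instead of relying on defaults. The fragment below extends the TargetGroupGreen resource from the template; the numeric values are illustrative assumptions, not recommendations, and should be tuned to how long your application takes to start.

```yaml
  # Fragment: TargetGroupGreen with explicit health check settings (values are illustrative)
  TargetGroupGreen:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      Name: my-app-tg-green
      Port: 80
      Protocol: HTTP
      VpcId: vpc-aaaaaaaaaaaaaaaaa
      TargetType: ip
      HealthCheckPath: /health
      HealthCheckIntervalSeconds: 10 # how often each target is probed
      HealthCheckTimeoutSeconds: 5 # kept shorter than the interval
      HealthyThresholdCount: 3 # consecutive successes before a target is healthy
      UnhealthyThresholdCount: 2 # consecutive failures before a target is unhealthy
      Matcher:
        HttpCode: '200' # only a 200 response counts as a pass
      TargetGroupAttributes:
        - Key: deregistration_delay.timeout_seconds
          Value: '30' # how long to drain connections when targets are removed
```

Shorter intervals and lower thresholds make Green report healthy sooner, at the cost of being more sensitive to brief hiccups.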
A common, subtle point people miss is how to manage the task definitions and services across the two environments. Often, you’ll have a "Blue" CloudFormation stack that defines the current production environment and a separate, parameterized "Green" stack or a CodePipeline stage that builds the new environment. Switching traffic is then its own step, run by hand or by a pipeline, that reconfigures the ALB listener only after the Green ECS service is deemed healthy. In this pattern you don’t update the live ECS service in place to use the new image, since that would be a rolling update; you create a separate Green service registered with the Green target group and then repoint the listener.
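If the Green environment lives in its own stack, it needs a way to find the shared cluster and target group. A hedged sketch using CloudFormation exports is shown below; the export names are hypothetical, and the remaining Green service properties are elided because they mirror the Blue service.

In the shared stack that owns the ALB, target groups, and cluster:

```yaml
Outputs:
  ClusterName:
    Value: !Ref EcsCluster
    Export:
      Name: my-app-cluster-name # hypothetical export name
  GreenTargetGroupArn:
    Value: !Ref TargetGroupGreen # Ref on a target group returns its ARN
    Export:
      Name: my-app-tg-green-arn # hypothetical export name
```

In the separate Green stack:

```yaml
Resources:
  ServiceGreen:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !ImportValue my-app-cluster-name
      # ... (TaskDefinition, DesiredCount, LaunchType, NetworkConfiguration) ...
      LoadBalancers:
        - TargetGroupArn: !ImportValue my-app-tg-green-arn
          ContainerName: my-app-container
          ContainerPort: 80
```

Keeping the listener in the shared stack means the Green stack can be created and deleted freely without ever touching production routing.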
The next step after mastering this is to automate the rollback process based on metrics.