ECS task placement can feel like magic because the scheduler makes many decisions for you, but those decisions follow from placement strategies and constraints that you control, and that exist to keep your services healthy and performant.
Let’s see this in action. Imagine you have a web service running on ECS. You want to ensure that if one Availability Zone (AZ) goes down, your service remains available. We’ll configure a placement strategy to spread tasks across AZs.
Here’s a simplified example of a task definition and a service definition.
Task Definition (simplified):
{
  "family": "my-web-app",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "nginx:latest",
      "portMappings": [
        {
          "containerPort": 80,
          "hostPort": 80
        }
      ]
    }
  ]
}
Service Definition with Placement Strategy:
{
  "cluster": "my-cluster",
  "serviceName": "my-web-service",
  "taskDefinition": "my-web-app:1",
  "desiredCount": 3,
  "launchType": "EC2",
  "placementStrategy": [
    {
      "type": "spread",
      "field": "attribute:ecs.availability-zone"
    }
  ]
}
When you create this service, ECS doesn’t just randomly launch your 3 tasks. It consults the placementStrategy. The spread strategy, with field: "attribute:ecs.availability-zone", tells ECS to distribute tasks as evenly as possible across the AZs in which the cluster has registered container instances. If those instances span us-east-1a, us-east-1b, and us-east-1c, and each AZ has free capacity, ECS will aim to place one task in each AZ.
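The problem this solves is resilience. If your tasks end up concentrated in a single AZ (which can happen, for example, when binpack is the only strategy) and that AZ experiences an outage, your service goes down. Spreading on attribute:ecs.availability-zone gives you a degree of fault tolerance. Keep in mind that placement strategies are best effort: if an AZ has no instance with enough free capacity, ECS still places the task somewhere else rather than leaving it unplaced, so an even spread is a goal and not a guarantee.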
Internally, ECS works from the container instances registered to your cluster: their remaining CPU, memory, and ports, and their attributes, such as the AZ they run in and their instance type. This applies to the EC2 launch type. Placement strategies and constraints are not supported for Fargate tasks; there, AWS manages the underlying infrastructure and tries to spread tasks across the accessible AZs on its own.
You can control task placement using two primary mechanisms: Placement Strategies and Placement Constraints.
Placement Strategies are about how to distribute tasks. ECS uses them to select an instance when it places a task, and to select which tasks to stop when a service scales in.
- spread: Distributes tasks evenly across values of a specified field. We saw this with attribute:ecs.availability-zone. You could also spread across instanceId to put tasks on as many instances as possible, or across attribute:ecs.instance-type to mix instance types for cost or performance reasons.
- binpack: Packs tasks onto as few instances as possible, leaving the least unused amount of the specified resource. The valid fields are cpu and memory. This can be useful for optimizing resource utilization and reducing costs by consolidating tasks onto fewer instances.
- random: Places tasks randomly among the instances that can run them. Simple, but offers no distribution guarantees.
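
To make the difference between spread and binpack concrete, here is a minimal sketch in Python. It is a toy model, not the ECS scheduler: the Instance class, the instance names, and the tie-breaking rules are assumptions made for illustration, and only memory is tracked.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Instance:
    """Toy stand-in for a container instance (hypothetical, not an ECS API type)."""
    name: str
    az: str
    free_memory: int  # MiB not yet reserved by tasks
    tasks: int = 0


def pick_spread(candidates, all_instances):
    """spread on AZ: prefer the AZ that currently runs the fewest tasks."""
    per_az = Counter()
    for inst in all_instances:
        per_az[inst.az] += inst.tasks
    return min(candidates, key=lambda i: (per_az[i.az], i.tasks))


def pick_binpack(candidates, all_instances):
    """binpack on memory: prefer the instance with the least free memory."""
    return min(candidates, key=lambda i: i.free_memory)


def place(instances, count, task_memory, pick):
    """Place `count` tasks one at a time; return the task count per instance."""
    for _ in range(count):
        candidates = [i for i in instances if i.free_memory >= task_memory]
        if not candidates:
            break  # nothing fits; a real service would report a placement failure
        chosen = pick(candidates, instances)
        chosen.free_memory -= task_memory
        chosen.tasks += 1
    return {i.name: i.tasks for i in instances}


def make_cluster():
    """Three identical instances, one per AZ (illustrative values)."""
    return [
        Instance("a1", "us-east-1a", 4096),
        Instance("b1", "us-east-1b", 4096),
        Instance("c1", "us-east-1c", 4096),
    ]


spread_result = place(make_cluster(), 3, 1024, pick_spread)
binpack_result = place(make_cluster(), 3, 1024, pick_binpack)
print(spread_result)   # {'a1': 1, 'b1': 1, 'c1': 1}
print(binpack_result)  # {'a1': 3, 'b1': 0, 'c1': 0}
```

With the same three tasks, the spread picker puts one task in each AZ, while the binpack picker fills the first instance before touching the others. That is the trade-off in miniature: resilience versus consolidation.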
Placement Constraints are about where tasks can or cannot be placed. They act as filters.
ECS supports two constraint types:
- memberOf: Restricts placement to instances that match an expression written in the cluster query language. For example, {"type": "memberOf", "expression": "attribute:ecs.instance-type == t3.medium"} only allows tasks to be placed on t3.medium instances. Expressions can test built-in attributes such as ecs.availability-zone and ecs.instance-type, as well as custom attributes that you attach to instances yourself.
- distinctInstance: Places each task in the group on a different container instance. It takes no expression: {"type": "distinctInstance"}.
Attribute checks are always written as memberOf expressions, and distributing tasks across the values of an attribute such as the AZ is handled by the spread strategy rather than by a constraint.
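
The "constraints act as filters" idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the dictionaries, instance IDs, and the "service:web" group name are hypothetical, and the member_of helper only checks attribute equality, whereas real memberOf expressions use the richer cluster query language.

```python
def member_of(instances, attribute, value):
    """Keep instances whose attribute equals value (stand-in for a memberOf expression)."""
    return [i for i in instances if i["attributes"].get(attribute) == value]


def distinct_instance(instances, task_group):
    """Keep instances that are not already running a task from this group."""
    return [i for i in instances if task_group not in i["running_groups"]]


# Hypothetical cluster state, for illustration only.
cluster = [
    {"id": "i-1", "attributes": {"ecs.instance-type": "t3.medium"},
     "running_groups": {"service:web"}},
    {"id": "i-2", "attributes": {"ecs.instance-type": "t3.medium"},
     "running_groups": set()},
    {"id": "i-3", "attributes": {"ecs.instance-type": "m5.large"},
     "running_groups": set()},
]

# Constraints narrow the candidate set before any strategy picks from it.
eligible = member_of(cluster, "ecs.instance-type", "t3.medium")
eligible = distinct_instance(eligible, "service:web")
eligible_ids = [i["id"] for i in eligible]
print(eligible_ids)  # ['i-2']
```

Unlike a strategy, a filter like this is binding: if the candidate list comes back empty, the task is not placed at all.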
Consider a scenario where you have a mix of GPU and CPU-only instances in your cluster. You might have a machine learning inference service that requires a GPU.
Service Definition with Placement Constraint:
{
  "cluster": "my-cluster",
  "serviceName": "my-gpu-service",
  "taskDefinition": "my-gpu-app:1",
  "desiredCount": 2,
  "launchType": "EC2",
  "placementConstraints": [
    {
      "type": "memberOf",
      "expression": "attribute:gpu-type == nvidia-tesla-v100"
    }
  ]
}
In this case, the memberOf constraint with the expression attribute:gpu-type == nvidia-tesla-v100 ensures that the tasks for my-gpu-service will only be launched on container instances that carry a custom attribute named gpu-type with the value nvidia-tesla-v100. The attribute name and value are illustrative, and ECS does not set them for you: you attach custom attributes to instances yourself, for example with aws ecs put-attributes or through ECS_INSTANCE_ATTRIBUTES in the ECS agent configuration. ECS evaluates the expression against every instance in the cluster. If no instance matches, ECS cannot place the tasks and the service stays below its desired count.
The most surprising aspect of placement strategies and constraints is how they interact with service auto scaling and rolling updates. When a service scales out, new tasks are launched according to the defined strategies and constraints. During a rolling update, ECS starts new tasks and stops old ones, and the same placement logic is applied to every new task launch. If your constraints are too restrictive, or if your cluster runs out of capacity on instances that satisfy them, both scaling and updates can stall, leaving your service in a degraded state. This is why it’s crucial to ensure your cluster has enough instances that match your placement configuration.
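
A rough way to reason about such stalls is to count how many more tasks the matching instances could still hold. The sketch below is a back-of-the-envelope model in Python, not an ECS API: the instance dictionaries, the gpu-type custom attribute, and the helper names are assumptions for illustration, and it considers only CPU and memory.

```python
def placeable_tasks(instances, required_attributes, task_cpu, task_memory):
    """Count how many additional tasks fit on instances matching all required attributes."""
    total = 0
    for inst in instances:
        attrs = inst["attributes"]
        if any(attrs.get(k) != v for k, v in required_attributes.items()):
            continue  # filtered out by the constraint
        # An instance holds as many tasks as its scarcer resource allows.
        total += min(inst["free_cpu"] // task_cpu, inst["free_memory"] // task_memory)
    return total


def shortfall(extra_tasks, instances, required_attributes, task_cpu, task_memory):
    """Return how many of the wanted extra tasks could not be placed (0 means no stall)."""
    fit = placeable_tasks(instances, required_attributes, task_cpu, task_memory)
    return max(0, extra_tasks - fit)


# Hypothetical cluster: two GPU instances and one large CPU-only instance.
cluster = [
    {"attributes": {"gpu-type": "nvidia-tesla-v100"}, "free_cpu": 2048, "free_memory": 4096},
    {"attributes": {"gpu-type": "nvidia-tesla-v100"}, "free_cpu": 2048, "free_memory": 4096},
    {"attributes": {}, "free_cpu": 8192, "free_memory": 16384},
]
gpu_only = {"gpu-type": "nvidia-tesla-v100"}

print(placeable_tasks(cluster, gpu_only, 1024, 2048))  # 4
print(shortfall(6, cluster, gpu_only, 1024, 2048))     # 2
```

Note that the CPU-only instance has plenty of room but contributes nothing, because the constraint excludes it. A scale-out or deployment that needs six more GPU tasks would come up two short until matching capacity is added.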
The next hurdle you’ll likely encounter is managing placement for services that have very specific networking requirements, such as needing to be on a particular subnet or within a specific VPC security group, often managed via awsvpc network mode.