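ECS service quotas are per-account, per-region limits that AWS enforces, and hitting one can prevent your services from scaling or even starting. Many of them can be raised on request, but until an increase is approved they act as hard ceilings.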
Let’s see ECS in action. Imagine you have a simple web service running on ECS. You’ve configured the service to run 5 tasks, and each task uses 2 vCPUs and 4 GiB of memory. Your task definition looks something like this:
{
  "family": "my-web-app",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "2048",
  "memory": "4096",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "app-container",
      "image": "public.ecr.aws/nginx/nginx:latest",
      "portMappings": [
        {
          "containerPort": 80,
          "protocol": "tcp"
        }
      ]
    }
  ]
}
And your service definition might have a desired count of 5:
{
  "cluster": "my-cluster",
  "serviceName": "my-web-service",
  "taskDefinition": "my-web-app",
  "desiredCount": 5,
  "launchType": "FARGATE",
  "networkConfiguration": {
    "awsvpcConfiguration": {
      "subnets": ["subnet-xxxxxxxxxxxxxxxxx", "subnet-yyyyyyyyyyyyyyyyy"],
      "assignPublicIp": "ENABLED"
    }
  }
}
When you create this service, ECS will try to provision the necessary resources. It will ask the underlying AWS infrastructure to launch 5 tasks, each requesting 2 vCPUs and 4 GiB of memory. But what if you’ve already used up your allocation for those resources? That’s where service quotas come in.
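To make the numbers concrete, here is a minimal sketch of how the footprint of this service could be derived from the two definitions. It assumes the task-level `cpu` value is given in CPU units (1024 units per vCPU) and `memory` in MiB, as in the example above; the helper name `service_footprint` is hypothetical and not part of any AWS SDK.

```python
import json

# Trimmed copies of the definitions above; only the fields needed for the arithmetic.
TASK_DEFINITION = json.loads('{"family": "my-web-app", "cpu": "2048", "memory": "4096"}')
SERVICE = json.loads('{"serviceName": "my-web-service", "desiredCount": 5}')

CPU_UNITS_PER_VCPU = 1024  # assumption: cpu is expressed in CPU units, 1024 per vCPU
MIB_PER_GIB = 1024         # assumption: memory is expressed in MiB


def service_footprint(task_definition: dict, service: dict) -> tuple[float, float]:
    """Return (total vCPUs, total GiB) requested when the service runs desiredCount tasks."""
    vcpus_per_task = int(task_definition["cpu"]) / CPU_UNITS_PER_VCPU
    gib_per_task = int(task_definition["memory"]) / MIB_PER_GIB
    count = service["desiredCount"]
    return vcpus_per_task * count, gib_per_task * count


total_vcpus, total_gib = service_footprint(TASK_DEFINITION, SERVICE)
print(f"{total_vcpus:g} vCPUs, {total_gib:g} GiB")  # prints "10 vCPUs, 20 GiB"
```

The sketch only handles numeric strings such as "2048"; a task definition that spells the value as "2 vCPU" would need extra parsing.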
The core problem ECS service quotas solve is preventing runaway resource consumption and ensuring fair usage across all AWS customers. AWS limits how many of certain resources you can provision within a region. These numbers aren’t arbitrary; they’re often tied to the physical capacity of AWS data centers and the need to maintain performance for everyone. For ECS, these limits show up in several key areas:
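- vCPU limits: On Fargate, this is the total number of vCPUs your tasks can run concurrently in a region, listed in Service Quotas as "Fargate On-Demand vCPU resource count" (Fargate Spot has its own quota). With the EC2 launch type, tasks run on instances you own, so the ceiling comes from your EC2 instance vCPU quotas and the capacity registered in the cluster.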
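- Memory limits: Memory is bounded per task rather than by a separate account-wide total. On Fargate, each task must use one of the supported CPU and memory combinations; on EC2, tasks are limited by the memory of the registered container instances.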
- Number of services per cluster: A limit on how many distinct ECS services you can create within a single ECS cluster.
- Number of tasks per service: A quota caps how many tasks a single service can run, which puts a ceiling on the desired count you can set.
- Number of task definition revisions per family: A limit on how many revisions of a specific task definition family you can register.
Let’s say your task definition requests 2 vCPUs and 4 GiB of memory. If you try to launch 5 tasks, that’s a total of 10 vCPUs and 20 GiB of memory for that service. If your account’s vCPU quota for the region is only 8, ECS cannot run all five tasks: at most four fit under the quota, the fifth fails to launch, and the service never reaches its desired count.
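The same arithmetic can be sketched in a few lines. This assumes the quota behaves as a simple cap on concurrently running vCPUs and that nothing else in the region is consuming it; `tasks_that_fit` is a hypothetical helper, not an AWS API.

```python
import math


def tasks_that_fit(vcpu_quota: float, vcpus_in_use: float, vcpus_per_task: float) -> int:
    """Number of additional tasks that fit in the remaining vCPU headroom."""
    headroom = max(vcpu_quota - vcpus_in_use, 0)
    return math.floor(headroom / vcpus_per_task)


desired = 5
fit = tasks_that_fit(vcpu_quota=8, vcpus_in_use=0, vcpus_per_task=2)
shortfall = max(desired - fit, 0)
print(f"{fit} of {desired} tasks fit; {shortfall} cannot launch")
# prints "4 of 5 tasks fit; 1 cannot launch"
```

Passing a non-zero `vcpus_in_use` shows how quickly the headroom disappears once other workloads in the region are counted.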
The exact quotas and their default values can be found in the AWS documentation, but the one you are most likely to hit first is the vCPU limit. You can check your applied quotas in the AWS console under "Service Quotas." For Fargate, open "AWS Fargate" and look for "Fargate On-Demand vCPU resource count." Quotas such as services per cluster and tasks per service are listed under "Amazon Elastic Container Service (Amazon ECS)."
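The same lookup can be scripted. The sketch below uses the boto3 `service-quotas` client and its `list_service_quotas` paginator; the service code "fargate", the region "us-east-1", and the helper names `find_quotas`, `fetch_quotas`, and `main` are assumptions made for illustration, so verify the service code against your own account before relying on it.

```python
def find_quotas(quotas: list[dict], keyword: str) -> dict[str, float]:
    """Map quota name to value for entries whose name contains keyword (case-insensitive)."""
    return {
        q["QuotaName"]: q["Value"]
        for q in quotas
        if keyword.lower() in q["QuotaName"].lower()
    }


def fetch_quotas(service_code: str, region: str) -> list[dict]:
    """Fetch all applied quotas for one service. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so find_quotas works without boto3 installed

    client = boto3.client("service-quotas", region_name=region)
    paginator = client.get_paginator("list_service_quotas")
    quotas = []
    for page in paginator.paginate(ServiceCode=service_code):
        quotas.extend(page["Quotas"])
    return quotas


def main() -> None:
    # Assumed values: adjust the service code and region for your account.
    quotas = fetch_quotas("fargate", "us-east-1")
    for name, value in find_quotas(quotas, "vcpu").items():
        print(f"{name}: {value:g}")
```

Call `main()` from a shell where AWS credentials are configured. The filtering step is kept separate from the API call so it can be exercised on a saved response without network access.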
The mental model to build is that ECS is an orchestrator, but it still needs to ask AWS for actual compute and memory resources. Service quotas are the gatekeepers that AWS places between ECS and the underlying infrastructure. If ECS requests more than a quota allows, AWS denies the request, and the tasks that would exceed it fail to launch, so your service creation or scaling operation stalls short of its target.
The levers you control are primarily the desiredCount in your service definition and the cpu and memory values in your task definition. You also control the number of services you create per cluster and the number of clusters you use.
One thing that often surprises people is that the Fargate vCPU quota is account-wide and region-specific. This means if you have multiple ECS clusters or services in the same region, their resource consumption all counts towards the same total quota. A small Fargate task for a background job in one cluster can consume vCPUs that prevent a critical web service in another cluster from scaling.
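A small sketch illustrates the shared pool. The cluster names, service names, task sizes, and the quota of 8 are all hypothetical, and the snapshot is hard-coded rather than fetched from AWS.

```python
from collections import defaultdict

# Hypothetical snapshot of running Fargate tasks in one region:
# (cluster, service, vCPUs per task, running task count)
RUNNING = [
    ("batch-cluster", "nightly-report", 0.5, 4),
    ("web-cluster", "my-web-service", 2, 2),
]


def vcpus_by_cluster(running) -> dict[str, float]:
    """Total vCPUs in use per cluster."""
    totals = defaultdict(float)
    for cluster, _service, vcpus, count in running:
        totals[cluster] += vcpus * count
    return dict(totals)


def regional_headroom(running, vcpu_quota: float) -> float:
    """vCPUs left under the shared quota after summing every cluster in the region."""
    return vcpu_quota - sum(vcpus_by_cluster(running).values())


print(vcpus_by_cluster(RUNNING))
print(regional_headroom(RUNNING, vcpu_quota=8))
```

With these numbers the batch cluster holds 2 vCPUs and the web cluster 4, leaving 2 vCPUs of headroom: the web service can add only one more 2-vCPU task, even though its own cluster looks nearly idle.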
The next concept to explore is how to effectively monitor and manage these quotas, including setting up alerts before you hit them.