ECS Container Insights offers a deeper look into your cluster’s performance than standard CloudWatch metrics.
Let’s see it in action. Imagine you have a service running in ECS, and you want to know not just the CPU and memory utilization of the service, but also its network traffic, storage reads and writes, and task counts, broken down per task. Standard ECS metrics give you the former. Container Insights gives you the latter, by collecting detailed performance log events from your tasks and containers and aggregating them into metrics.
Here’s a simple task definition in task-definition.json:
{
  "family": "my-app-service",
  "networkMode": "awsvpc",
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "1024",
  "memory": "2048",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "my-app-container",
      "image": "nginx:latest",
      "portMappings": [
        {
          "containerPort": 80,
          "hostPort": 80,
          "protocol": "tcp"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/my-app-service",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
To enable Container Insights, you don’t modify the task definition itself. Instead, you enable it at the cluster level. When you create or update an ECS cluster, there’s a checkbox or a parameter for Container Insights. For example, using the AWS CLI to create a cluster:
aws ecs create-cluster \
  --cluster-name my-insights-cluster \
  --settings name=containerInsights,value=enabled \
  --tags key=Name,value=MyInsightsCluster key=Environment,value=Dev
Or, to enable Container Insights on an existing cluster:
aws ecs update-cluster-settings \
  --cluster my-insights-cluster \
  --settings name=containerInsights,value=enabled
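To confirm that the setting took effect, one option is to read the cluster’s settings back. The sketch below assumes boto3 is installed and AWS credentials are configured. `container_insights_enabled` and `fetch_cluster` are hypothetical helper names, and the sample dictionary is hand-written to mirror the shape of a `describe_clusters` response rather than captured from a real account.

```python
def container_insights_enabled(cluster: dict) -> bool:
    """Return True if a cluster description has the containerInsights setting on."""
    for setting in cluster.get("settings", []):
        if setting.get("name") == "containerInsights":
            # Assumption of this sketch: any value other than "disabled" counts as on.
            return setting.get("value") != "disabled"
    return False


def fetch_cluster(cluster_name: str) -> dict:
    """Fetch one cluster description, including its settings, from ECS."""
    import boto3  # imported lazily so the helper above works without boto3

    ecs = boto3.client("ecs")
    response = ecs.describe_clusters(clusters=[cluster_name], include=["SETTINGS"])
    return response["clusters"][0]


# Hand-written sample in the shape of one entry of response["clusters"].
sample_cluster = {
    "clusterName": "my-insights-cluster",
    "settings": [{"name": "containerInsights", "value": "enabled"}],
}
print(container_insights_enabled(sample_cluster))
```

Against a real account, you would pass `fetch_cluster("my-insights-cluster")` to `container_insights_enabled` instead of the sample dictionary.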
The key is to enable it via the cluster settings. Once enabled, ECS does not add a sidecar container to your tasks. With the EC2 launch type, the data comes from the ECS container agent (ecs-agent) that already runs on each container instance; with Fargate, AWS collects it for you.
The collected data lands in CloudWatch Logs as performance log events. Specifically, a log group named /aws/ecs/containerinsights/CLUSTER_NAME/performance receives JSON-formatted events with fields such as CpuReserved, CpuUtilized, MemoryReserved, MemoryUtilized, StorageReadBytes, StorageWriteBytes, NetworkRxBytes, and NetworkTxBytes.
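To make the shape of these events concrete, here is a minimal sketch that parses one event and turns the utilized and reserved figures into percentages. The sample event is hand-written for illustration (the values are invented and real events carry more fields), and `utilization_percent` is a hypothetical helper name.

```python
import json

# Illustrative event only: the values are made up and real events have more fields.
SAMPLE_EVENT = """
{
  "Type": "Task",
  "ClusterName": "my-insights-cluster",
  "TaskDefinitionFamily": "my-app-service",
  "CpuUtilized": 256.0,
  "CpuReserved": 1024.0,
  "MemoryUtilized": 512,
  "MemoryReserved": 2048
}
"""


def utilization_percent(event_json: str) -> dict:
    """Return utilized/reserved as a percentage for each resource in the event."""
    event = json.loads(event_json)
    result = {}
    for resource in ("Cpu", "Memory"):
        used = event.get(f"{resource}Utilized")
        reserved = event.get(f"{resource}Reserved")
        # Skip resources that are missing or have no reservation to divide by.
        if used is None or not reserved:
            continue
        result[resource] = round(100.0 * used / reserved, 1)
    return result


print(utilization_percent(SAMPLE_EVENT))  # {'Cpu': 25.0, 'Memory': 25.0}
```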
These log events are then automatically processed by CloudWatch Container Insights to create metrics and dashboards. You can find the metrics in the CloudWatch console under "Metrics" -> "All metrics", in the ECS/ContainerInsights namespace. You’ll see metrics broken down by cluster, service, and task definition family, while the per-container detail stays in the performance log events.
The problem this solves is the "black box" nature of standard container monitoring. You might see your service’s CPU spike, but you wouldn’t know why. Is it the application itself, or is it something external like excessive network I/O or storage contention? Container Insights gives you that granularity. It aggregates these detailed metrics, making it easy to spot performance bottlenecks at the task level, with per-container detail available in the performance log events. You can then create CloudWatch Alarms based on these metrics. For instance, you could alarm if MemoryUtilized exceeds 80% of MemoryReserved for a critical service.
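As a sketch of such an alarm, the snippet below builds the parameters for a CloudWatch metric-math alarm that fires when a service’s memory use stays above 80% of its reservation. It assumes the ECS/ContainerInsights metrics MemoryUtilized and MemoryReserved with ClusterName and ServiceName dimensions. The helper names, the alarm name, the 60-second period, and the three evaluation periods are illustrative choices, not recommendations.

```python
def memory_alarm_params(cluster: str, service: str, threshold_pct: float = 80.0) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm using metric math."""
    dimensions = [
        {"Name": "ClusterName", "Value": cluster},
        {"Name": "ServiceName", "Value": service},
    ]

    def metric(query_id: str, name: str) -> dict:
        # Input metrics feed the expression and are not evaluated on their own.
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "ECS/ContainerInsights",
                    "MetricName": name,
                    "Dimensions": dimensions,
                },
                "Period": 60,
                "Stat": "Average",
            },
            "ReturnData": False,
        }

    return {
        "AlarmName": f"{service}-memory-above-{int(threshold_pct)}pct",
        "ComparisonOperator": "GreaterThanThreshold",
        "EvaluationPeriods": 3,
        "Threshold": threshold_pct,
        "Metrics": [
            metric("used", "MemoryUtilized"),
            metric("reserved", "MemoryReserved"),
            {
                "Id": "pct",
                "Expression": "100 * used / reserved",
                "Label": "Memory utilization (%)",
                "ReturnData": True,  # the alarm evaluates this expression
            },
        ],
    }


def create_alarm(params: dict) -> None:
    """Send the alarm definition to CloudWatch (needs boto3 and credentials)."""
    import boto3  # imported lazily so building the parameters needs no AWS access

    boto3.client("cloudwatch").put_metric_alarm(**params)


params = memory_alarm_params("my-insights-cluster", "my-app-service")
print(params["AlarmName"])  # my-app-service-memory-above-80pct
```

Keeping the parameter builder separate from the API call makes it possible to inspect or unit-test the alarm definition before anything is created in your account.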
The most surprising thing about Container Insights is how it handles Fargate. With Fargate, you don’t manage the underlying instances, so you might assume you’d have less visibility. However, Container Insights works seamlessly by having the Fargate control plane collect and aggregate the metrics on your behalf. You just enable it at the cluster level, and AWS handles the rest, pushing the same detailed performance data to your CloudWatch logs and metrics.
The next step after enabling Container Insights is to explore how to create custom dashboards that combine these granular metrics with your application logs for a holistic view of your service’s health.