ECS daemon scheduling is something of a hidden gem. Most people think of it as just a way to ensure a container runs on every host, but its real value is that it lets you treat your EC2 instances not as a pool to draw from, but as individual compute units that each need their own copy of a specific task.
Let’s see it in action. Imagine you’re running a cluster with a few EC2 instances. You want to deploy a logging agent, a monitoring tool, or a security scanner that must run on every single host.
Here’s a simplified task definition for a hypothetical logging agent:
```json
{
  "family": "logging-agent",
  "networkMode": "host",
  "requiresCompatibilities": ["EC2"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/LoggingAgentTaskRole",
  "containerDefinitions": [
    {
      "name": "fluentd",
      "image": "fluent/fluentd:v1.14.5-debian-1",
      "portMappings": [],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/logging-agent",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "fluentd"
        }
      },
      "mountPoints": [
        {
          "sourceVolume": "logVolume",
          "containerPath": "/var/log/containers"
        }
      ],
      "user": "1000"
    }
  ],
  "volumes": [
    {
      "name": "logVolume",
      "host": {
        "sourcePath": "/var/log/containers"
      }
    }
  ]
}
```
Notice networkMode: "host". This matters for daemon tasks that need direct access to the host’s network interfaces, like agents collecting host-level metrics or logs. The container shares the host’s network stack, so any port it listens on is opened directly on the host. You only need a portMappings entry if you want to declare such a port, and in host mode the hostPort must either be omitted or equal the containerPort. The volumes section defines a bind mount of the host directory /var/log/containers, and mountPoints attaches it at the same path inside the container, allowing the agent to read host logs.
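As a quick illustration of the host-mode port rule, here is a minimal Python sketch that scans a task definition (loaded as a dict) for port mappings where hostPort differs from containerPort. The function name host_mode_port_problems and the trimmed-down task definition are made up for this example; it is a local sanity check, not something ECS provides.

```python
def host_mode_port_problems(task_def):
    """Return a list of portMappings problems for a host-mode task definition."""
    problems = []
    if task_def.get("networkMode") != "host":
        return problems  # this check only covers host networking
    for container in task_def.get("containerDefinitions", []):
        for mapping in container.get("portMappings", []):
            container_port = mapping.get("containerPort")
            host_port = mapping.get("hostPort")
            # In host mode, hostPort may be left out or must match containerPort.
            if host_port is not None and host_port != container_port:
                problems.append(
                    f"{container['name']}: hostPort {host_port} "
                    f"!= containerPort {container_port}"
                )
    return problems


# Trimmed-down copy of the logging-agent task definition (illustrative only).
task_def = {
    "family": "logging-agent",
    "networkMode": "host",
    "containerDefinitions": [{"name": "fluentd", "portMappings": []}],
}

print(host_mode_port_problems(task_def))  # []
```

An empty list means nothing in the task definition conflicts with host networking; a mapping such as containerPort 24224 with hostPort 8080 would be reported.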
Now, the magic happens when you create a service and select "DAEMON" as the service type (the scheduling strategy), rather than the default "REPLICA".
In the ECS console, you’d select your cluster, create a new service from this task definition, and choose "DAEMON" as the service type. Note that DAEMON is not a launch type; the launch type here is EC2.
The key configuration here is under "Service configuration":
- Launch type: EC2
- Service type: DAEMON
You’ll also pick the cluster and the task definition. With the daemon strategy there is no desired count or placement strategy to specify, since the scheduler’s job is simply one task per eligible instance. Placement constraints still apply, and they are how you narrow the set of instances. Make sure your EC2 instances are registered with the cluster and that the service has no placement constraint that would keep it off instances you care about (e.g., don’t add a memberOf constraint on attribute:ecs.instance-type if you want the agent on all instance types).
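The same configuration can be expressed outside the console. The sketch below builds the keyword arguments for boto3’s ecs create_service call; the helper name daemon_service_params and the cluster name "my-cluster" are placeholders invented for this example, and the actual API call is left as a comment so the snippet runs without AWS credentials.

```python
def daemon_service_params(cluster, service_name, task_definition, constraints=None):
    """Build create_service arguments for a daemon service on EC2."""
    params = {
        "cluster": cluster,
        "serviceName": service_name,
        "taskDefinition": task_definition,
        "launchType": "EC2",
        "schedulingStrategy": "DAEMON",
        # No desiredCount and no placementStrategy: the daemon scheduler
        # places one task per eligible container instance.
    }
    if constraints:
        # Each expression uses the ECS cluster query language.
        params["placementConstraints"] = [
            {"type": "memberOf", "expression": expr} for expr in constraints
        ]
    return params


# "my-cluster" is a hypothetical cluster name.
everywhere = daemon_service_params("my-cluster", "logging-agent", "logging-agent")

# Restricting the daemon to one instance family (illustrative expression).
t3_only = daemon_service_params(
    "my-cluster",
    "logging-agent",
    "logging-agent",
    constraints=["attribute:ecs.instance-type =~ t3.*"],
)

# With boto3 installed and credentials configured, the call would look like:
#   import boto3
#   boto3.client("ecs").create_service(**everywhere)
print(everywhere["schedulingStrategy"])  # DAEMON
```

Leaving constraints empty is what puts the agent on every registered instance; adding one is the first step toward excluding hosts.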
ECS will then automatically schedule exactly one copy of your logging-agent task on each EC2 instance that is registered with your cluster and meets any specified placement constraints (or on every instance, if you specify none). If you add a new EC2 instance to the cluster, ECS will automatically launch a logging-agent task on it. If an EC2 instance is terminated, the task on it goes away with it.
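One way to picture this behavior is as a reconciliation between the set of eligible instances and the set of instances that already have the task. The following is a toy model in Python, not how ECS is actually implemented; the function name reconcile and the instance and task IDs are invented for illustration.

```python
def reconcile(active_instances, tasks_by_instance):
    """Toy model of daemon scheduling.

    Returns (instances that need a task started,
             instances whose task no longer has a host).
    """
    active = set(active_instances)
    running = set(tasks_by_instance)
    to_start = sorted(active - running)  # new or uncovered hosts
    to_stop = sorted(running - active)   # hosts that left the cluster
    return to_start, to_stop


# Hypothetical state: i-bbb and i-ccc just joined, i-zzz was terminated.
instances = ["i-aaa", "i-bbb", "i-ccc"]
tasks = {"i-aaa": "task-1", "i-zzz": "task-9"}

to_start, to_stop = reconcile(instances, tasks)
print(to_start)  # ['i-bbb', 'i-ccc']
print(to_stop)   # ['i-zzz']
```

Note that there is no target count anywhere in the model: the number of tasks is purely a function of the number of eligible hosts.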
This isn’t about scaling up or down a pool of containers; it’s about ensuring a specific piece of infrastructure software is present and running on every host, regardless of how many hosts you have. It’s about treating each host as a first-class citizen that needs its own instance of a particular agent.
The most surprising aspect is how this model shifts your thinking about distributed systems management. Instead of managing a "service" as a collection of interchangeable replicas, you’re managing a set of individual host agents. When you need to update the agent, you’re not scaling a service; you register a new task definition revision and update the service, and ECS rolls the change out across all daemon tasks until each host runs the new version. You don’t think about "how many logging agents do I need?", you think "every host needs one logging agent".
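An agent update could be sketched as follows, again building arguments for boto3’s ecs update_service call rather than invoking it. The helper name daemon_update_params, the cluster name, and the revision "logging-agent:2" are hypothetical. The deployment percentages reflect my understanding of the usual daemon settings (one task per host, so the maximum stays at 100); treat them as an assumption and check the ECS documentation for your setup.

```python
def daemon_update_params(cluster, service, new_task_definition):
    """Build update_service arguments to roll a daemon service to a new revision."""
    return {
        "cluster": cluster,
        "service": service,
        "taskDefinition": new_task_definition,
        "deploymentConfiguration": {
            # Assumed daemon-friendly values: a host never runs two copies,
            # so the old task is stopped before its replacement starts.
            "minimumHealthyPercent": 0,
            "maximumPercent": 100,
        },
    }


# "my-cluster" and revision 2 are placeholders for this example.
update = daemon_update_params("my-cluster", "logging-agent", "logging-agent:2")

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("ecs").update_service(**update)
print(update["taskDefinition"])  # logging-agent:2
```

Nothing in the request mentions a task count, which is the point: the update targets the agent version, and the host set determines how many tasks get replaced.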
The networkMode: "host" is often the most misunderstood part of daemon tasks. People try to use bridge mode and wonder why the agent can’t see the host’s network interfaces or traffic. When using host networking, the container uses the host’s network namespace directly. This means the container’s processes will appear to have the host’s IP address, and any ports the container binds to are bound directly to the host’s network interfaces, not to a specific container IP. This is essential for agents that need to observe or bind to the host’s network stack. Access to the host’s file system is a separate matter and comes from the volume mounts, not from the network mode.
If you’re running a daemon task and need to access host logs, and you’ve mapped /var/log/containers into your container, but the container still can’t see the logs, the most common culprit is the user setting in your container definition. If the user is set to a non-root UID (such as 1000 in the example above) that doesn’t have read permissions on the host’s /var/log/containers directory, you’ll get permission denied errors. Root (UID 0) can normally read the directory, so the problem usually appears only after switching to an unprivileged user. Ensure the user specified (or the image’s default user if none is specified) has the necessary file system permissions on the host.
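To confirm a permissions problem from inside the container, a small diagnostic like the one below can help. It is a minimal sketch that assumes Python is available in the image (which may not be true for your agent); the function name can_read_dir is invented for this example.

```python
import os


def can_read_dir(path):
    """Report whether the current process can list and read `path`."""
    return {
        "path": path,
        "uid": os.getuid(),  # the UID the container is actually running as
        "exists": os.path.isdir(path),
        # Listing a directory needs both read and execute (search) permission.
        "readable": os.access(path, os.R_OK | os.X_OK),
    }


if __name__ == "__main__":
    print(can_read_dir("/var/log/containers"))
```

If exists is true but readable is false, compare the reported uid with the owner and mode of the directory on the host (for example with ls -ld /var/log/containers).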
The next thing you’ll likely explore is how to exclude certain EC2 instances from running daemon tasks using placement constraints.