Container Insights is a CloudWatch feature that collects, aggregates, and summarizes metrics and logs from your containerized applications and microservices.

Here’s the manifest for a simple Nginx deployment on an EKS cluster, which we’ll then monitor with Container Insights:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: LoadBalancer

This setup deploys three Nginx pods and exposes them via a LoadBalancer service. Now, let’s enable Container Insights.
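
As a minimal sketch, the manifest above could be applied and checked like this. The filename nginx.yaml is an assumption (save the manifest under any name you like), and the commands are wrapped in a function so that nothing runs until you call it.

```shell
# Assumption: the Deployment and Service above are saved as nginx.yaml
# (hypothetical filename).
MANIFEST="nginx.yaml"

deploy_nginx() {
  # Create or update the Deployment and the Service.
  kubectl apply -f "$MANIFEST"
  # Wait for the rollout to finish, giving up after two minutes.
  kubectl rollout status deployment/nginx-deployment --timeout=120s
  # Show the Service; EXTERNAL-IP fills in once the load balancer is provisioned.
  kubectl get service nginx-service
}
```

Run deploy_nginx from a shell whose kubectl context points at your EKS cluster.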

The core of Container Insights is a set of DaemonSets that run a collector on each EKS node. The CloudWatch agent gathers metrics, and Fluent Bit (or the older Fluentd integration) gathers container logs; both forward their data to CloudWatch.

To enable Container Insights, you need to deploy the CloudWatch agent as a DaemonSet. This DaemonSet is typically managed by an AWS-provided Helm chart or a CloudFormation template.

Here’s how you’d typically install it using the AWS CLI and kubectl after setting up the necessary IAM permissions. These permissions are crucial and often overlooked: ensure your EKS nodes have an IAM role that allows them to send data to CloudWatch Logs and CloudWatch metrics.
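
One way to grant those permissions is to attach the AWS managed policy CloudWatchAgentServerPolicy to the node role. The sketch below assumes a role named my-eks-node-role, which is a placeholder; substitute the role your node group actually uses.

```shell
# Assumption: my-eks-node-role is a placeholder for the IAM role attached to
# your EKS worker nodes.
NODE_ROLE_NAME="my-eks-node-role"
# AWS managed policy that allows publishing to CloudWatch Logs and metrics.
POLICY_ARN="arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"

attach_cloudwatch_policy() {
  # Attach the managed policy to the node role.
  aws iam attach-role-policy --role-name "$NODE_ROLE_NAME" --policy-arn "$POLICY_ARN"
  # List the role's attached policies to confirm the change.
  aws iam list-attached-role-policies --role-name "$NODE_ROLE_NAME"
}
```

Calling attach_cloudwatch_policy requires AWS credentials that are allowed to modify IAM roles.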

First, create a namespace for the agent:

kubectl create namespace amazon-cloudwatch

Next, apply the CloudWatch agent DaemonSet manifest. You can get the latest version of this manifest from the official AWS documentation. A simplified example of what’s inside that manifest looks like this, focusing on the key parts:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cloudwatch-agent
  namespace: amazon-cloudwatch
spec:
  selector:
    matchLabels:
      app: cloudwatch-agent
  template:
    metadata:
      labels:
        app: cloudwatch-agent
    spec:
      containers:
      - name: cloudwatch-agent
        image: public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest # Use a specific, stable version in production
        command:
          - /opt/aws/bin/cloudwatch-agent-ctl
          - -a
          - fetch-config
          - -m
          - ec2
          - -c
          - ssm:AmazonCloudWatch-ContainerInsights-EKS
          - -f
          - /etc/cloudwatch-agent/container-insights-fluent-bit-config.yaml
        volumeMounts:
        - name: host-log
          mountPath: /var/log
        - name: host-docker
          mountPath: /var/lib/docker
        - name: host-run
          mountPath: /var/run/docker.sock
      volumes:
      - name: host-log
        hostPath:
          path: /var/log
      - name: host-docker
        hostPath:
          path: /var/lib/docker
      - name: host-run
        hostPath:
          path: /var/run/docker.sock

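The critical part of this configuration is the fetch-config command. It tells the agent to download a pre-defined Container Insights configuration from AWS Systems Manager (SSM) Parameter Store. In the argument ssm:AmazonCloudWatch-ContainerInsights-EKS, the ssm: prefix selects Parameter Store as the source and the rest is the parameter name, here a configuration tailored for EKS. This configuration includes rules for parsing logs and collecting specific metrics like CPU, memory, network, and disk I/O from containers, pods, and nodes. Treat the command as illustrative: the manifest published in the AWS documentation is authoritative, and the quick-start version supplies the agent configuration through a ConfigMap rather than Parameter Store, so check how your copy loads its settings.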

Once the DaemonSet is running, you should start seeing metrics and logs appearing in CloudWatch under the "Container Insights" section. You’ll find aggregated views for your cluster, nodes, pods, and services.
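
Applying the manifest and confirming that the agent is healthy could look like the sketch below. It assumes the manifest from the AWS documentation was saved as cwagent-daemonset.yaml (a hypothetical filename) and that it uses the DaemonSet name and app label from the simplified example above; adjust both if your manifest differs.

```shell
# Assumptions: cwagent-daemonset.yaml is a hypothetical local filename, and
# the DaemonSet name and label match the simplified example in this article.
MANIFEST="cwagent-daemonset.yaml"
NAMESPACE="amazon-cloudwatch"

install_and_check_agent() {
  # Create or update the agent DaemonSet.
  kubectl apply -f "$MANIFEST"
  # Wait until an agent pod is ready on every node (two-minute limit).
  kubectl rollout status daemonset/cloudwatch-agent -n "$NAMESPACE" --timeout=120s
  # List the agent pods together with the nodes they run on.
  kubectl get pods -n "$NAMESPACE" -l app=cloudwatch-agent -o wide
  # Print recent agent log lines to spot permission or configuration errors.
  kubectl logs daemonset/cloudwatch-agent -n "$NAMESPACE" --tail=20
}
```

If the logs show access-denied errors, revisit the IAM role attached to the nodes before looking elsewhere.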

The system works by running two collectors as DaemonSets. The CloudWatch agent collects resource usage statistics for each node and its containers and sends them to CloudWatch Logs as performance log events in embedded metric format; CloudWatch then extracts metrics from those events, with dimensions such as ClusterName, Namespace, and PodName for detailed performance analysis. Fluent Bit (or Fluentd) tails the container log files and ships them to CloudWatch Logs for log aggregation.
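
To see where this data lands, you can list the log groups that Container Insights creates under the /aws/containerinsights/ prefix for your cluster. This is a sketch only; the cluster name my-cluster is a placeholder.

```shell
# Assumption: my-cluster is a placeholder for your EKS cluster name.
CLUSTER_NAME="my-cluster"
LOG_GROUP_PREFIX="/aws/containerinsights/${CLUSTER_NAME}/"

list_insights_log_groups() {
  # Print only the names of the log groups under the cluster's prefix.
  aws logs describe-log-groups \
    --log-group-name-prefix "$LOG_GROUP_PREFIX" \
    --query 'logGroups[].logGroupName' \
    --output text
}
```

A working setup typically shows a performance log group for metrics data alongside groups for application and host logs.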

The container-insights-fluent-bit-config.yaml file, fetched via SSM, contains the parsing rules and output destinations. It’s configured to look for specific log formats and to collect metrics from the container runtime API.

A notable property of Container Insights is that it doesn’t just collect metrics; it also correlates them with your Kubernetes resource definitions. When you look at a pod’s metrics in CloudWatch, you’ll see dimensions that map directly to Kubernetes concepts like PodName, Namespace, ClusterName, and even ReplicaSet and Deployment names, allowing you to trace performance issues from the node all the way up to your application deployment.
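
You can inspect those dimensions from the command line by listing the metrics published in the ContainerInsights namespace. The sketch below filters on the pod_cpu_utilization metric and a placeholder cluster name, my-cluster; each result lists the full dimension set attached to that metric.

```shell
# Assumption: my-cluster is a placeholder for your EKS cluster name.
CLUSTER_NAME="my-cluster"

list_pod_cpu_metrics() {
  # List every pod CPU metric for the cluster, including its dimensions.
  aws cloudwatch list-metrics \
    --namespace ContainerInsights \
    --metric-name pod_cpu_utilization \
    --dimensions "Name=ClusterName,Value=${CLUSTER_NAME}"
}
```

The Dimensions array in each returned entry shows which Kubernetes identifiers CloudWatch attached to the metric.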

The next concept you’ll want to explore is setting up custom metrics and log forwarding for your applications, beyond what Container Insights provides out-of-the-box.
