Monitor containerd Health with the Built-In Health Check API (2026)

The most surprising thing about containerd’s health check API is that it doesn’t actually do the checking itself; it’s a declarative contract for other systems to use.

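Let’s see it in action. Imagine you’ve got a simple web service running in a container. The spec below is an illustrative sketch of how its health checks are declared, not a file containerd loads as written: containerd’s own configuration is TOML, and in a Kubernetes cluster these probe fields live in the Pod spec.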

version: "2.0"

plugins:
  - name: io.containerd.grpc.v1.cri
    module: io.containerd.grpc.v1
    config: {}

containers:
  - id: my-web-app
    image: docker.io/library/nginx:latest
    labels:
      io.containerd.grpc.v1.cri.image-name: docker.io/library/nginx:latest
    linux:
      resources: {}
    # Health checks are declared under 'cri' using Kubernetes-style probe fields.
    # Both probes run `wget -qO- http://localhost:80/` inside the container;
    # exit code 0 means the probe passed.
    #   livenessProbe:       is the container still working? If not, restart it.
    #   readinessProbe:      is the container ready to serve traffic?
    #   exec.command:        the command and its arguments
    #   initialDelaySeconds: grace period after container start before checks
    #                        begin (10 s for liveness, 5 s for readiness here)
    #   periodSeconds:       how often to run the check
    #   timeoutSeconds:      how long to wait for each check
    #   failureThreshold:    consecutive failures before marking as unhealthy
    # 'successThreshold' is not set, so the Kubernetes default of 1 applies.
    # These play roughly the roles of 'start-period', 'interval', 'timeout' and
    # 'retries' in a Docker HEALTHCHECK.
    cri:
      livenessProbe:
        exec:
          command: ["wget", "-qO-", "http://localhost:80/"]
        initialDelaySeconds: 10
        periodSeconds: 5
        timeoutSeconds: 1
        failureThreshold: 3
      readinessProbe:
        exec:
          command: ["wget", "-qO-", "http://localhost:80/"]
        initialDelaySeconds: 5
        periodSeconds: 5
        timeoutSeconds: 1
        failureThreshold: 3

When a container is created with these livenessProbe and readinessProbe settings, containerd doesn’t magically know how to check http://localhost:80/, and it does not schedule the checks. The probe definitions belong to the CRI client, such as the Kubernetes kubelet or a custom controller built on containerd’s API. That client runs the specified command (wget -qO- http://localhost:80/ in this case) inside the container at the defined interval, in the kubelet’s case by asking containerd’s CRI plugin to exec it, and then reads the exit code. Exit code 0 means the probe passed; a non-zero exit code or a timeout means it failed.
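The execution step described above can be sketched in a few lines of Python. This is a minimal illustration, not the kubelet's or containerd's code: ExecProbe and run_exec_probe are hypothetical names, and the command runs as a local subprocess instead of inside a container so that the sketch stays self-contained.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class ExecProbe:
    """Hypothetical mirror of the exec probe fields used in the spec above."""
    command: List[str]
    timeout_seconds: float = 1.0


def run_exec_probe(probe: ExecProbe) -> bool:
    """Return True when the command exits with code 0 within the timeout.

    Assumption: a real prober would run the command inside the container
    through the runtime; here it is a plain local subprocess.
    """
    try:
        result = subprocess.run(
            probe.command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=probe.timeout_seconds,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A timeout or a missing binary both count as a failed probe here.
        return False
    return result.returncode == 0
```

Calling run_exec_probe(ExecProbe(["wget", "-qO-", "http://localhost:80/"])) would reproduce the check from the spec, provided wget is installed and something is listening on port 80.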

This separation of concerns is key. containerd supplies the mechanism: it runs the container, executes commands inside it on request, and reports container state. It does not decide what counts as healthy or how often to check; that policy belongs to whatever system drives containerd through the CRI or its native API. The io.containerd.grpc.v1.cri plugin, for instance, serves the kubelet’s exec requests and returns the exit code, which the kubelet then interprets as a probe result.
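One way to picture this division of labor is a prober that owns the policy and a runtime that only executes what it is asked to. The sketch below is an assumption-laden illustration: ContainerRuntime, FakeRuntime and Prober are hypothetical names, and exec_sync is only loosely modeled on the CRI's ExecSync call, not its real signature.

```python
from typing import List, Protocol


class ContainerRuntime(Protocol):
    """Hypothetical runtime interface: run a command, return its exit code."""

    def exec_sync(self, container_id: str, command: List[str], timeout: float) -> int:
        ...


class FakeRuntime:
    """Stand-in runtime that returns canned exit codes, for illustration only."""

    def __init__(self, exit_codes: List[int]) -> None:
        self._exit_codes = list(exit_codes)
        self.calls: List[List[str]] = []

    def exec_sync(self, container_id: str, command: List[str], timeout: float) -> int:
        self.calls.append(command)
        return self._exit_codes.pop(0)


class Prober:
    """Holds the probe definition; the runtime knows nothing about health."""

    def __init__(self, runtime: ContainerRuntime, container_id: str,
                 command: List[str], timeout: float = 1.0) -> None:
        self.runtime = runtime
        self.container_id = container_id
        self.command = command
        self.timeout = timeout

    def probe_once(self) -> bool:
        # The prober, not the runtime, interprets exit code 0 as "healthy".
        code = self.runtime.exec_sync(self.container_id, self.command, self.timeout)
        return code == 0
```

Swapping FakeRuntime for a client that talks to a real runtime would leave Prober unchanged, which is the point of keeping policy and mechanism apart.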

The livenessProbe tells the system if the container is still alive and functioning. If it fails repeatedly, the container is considered unhealthy and typically restarted. The readinessProbe tells the system if the container is ready to accept traffic. If it fails, the container is removed from service endpoints, even if it’s still running. This distinction is vital for managing application availability during deployments or scaling events.
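The difference between the two probe types comes down to what happens once the failure threshold is crossed. The following sketch models that with a hypothetical ProbeTracker class. It is a simplification: the action names are made up for illustration, and the tracker starts in the healthy state, whereas a real orchestrator treats a container as not ready until its first successful readiness probe.

```python
from dataclasses import dataclass


@dataclass
class ProbeTracker:
    kind: str                    # "liveness" or "readiness"
    failure_threshold: int = 3
    success_threshold: int = 1
    healthy: bool = True         # simplification: start out healthy
    failures: int = 0
    successes: int = 0

    def record(self, ok: bool) -> str:
        """Record one probe result and return the action it triggers."""
        if ok:
            self.successes += 1
            self.failures = 0
            if not self.healthy and self.successes >= self.success_threshold:
                self.healthy = True
                return "add-to-endpoints" if self.kind == "readiness" else "none"
            return "none"
        self.failures += 1
        self.successes = 0
        if self.healthy and self.failures >= self.failure_threshold:
            self.healthy = False
            # Same failure count, different consequence per probe type.
            return "restart" if self.kind == "liveness" else "remove-from-endpoints"
        return "none"
```

Only consecutive failures count: a single success resets the failure counter, so an occasional slow response does not restart the container or pull it out of rotation.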

The most common misconception is that containerd itself decides to run curl or wget commands. It doesn’t. A higher-level system (like Kubernetes’ kubelet) holds the probe definition, triggers each execution, observes the result, and takes action based on the probe outcome; containerd only carries out the exec when asked.

The next problem you’ll likely encounter is understanding how these health checks integrate with your service discovery and load balancing mechanisms.