containerd’s event stream is a practical starting point for container-aware observability with eBPF: it tells you when containers and tasks are created, started, and removed, so you can tie kernel-level traces of lifecycle and network activity to specific containers.
Let’s see this in action. Imagine you have a simple application running in a container. We’ll use containerd to manage it and cilium/ebpf to write a custom eBPF program.
First, pull an image and start a simple container using ctr, the CLI that ships with containerd:
ctr image pull docker.io/library/alpine:latest
ctr run --rm -t docker.io/library/alpine:latest my-alpine sh
Now, let’s write a basic Go program that loads a placeholder eBPF program with cilium/ebpf, subscribes to containerd’s event stream, and logs container lifecycle events. We’ll use the containerd Go client library to interact with the event stream.
package main

import (
	"context"
	"log"
	"time"

	"github.com/cilium/ebpf"
	"github.com/cilium/ebpf/asm"
	apievents "github.com/containerd/containerd/api/events"
	containerd "github.com/containerd/containerd/v2/client"
	"github.com/containerd/typeurl/v2"
)

// containerd serves its events API over gRPC on its main socket; there is no
// separate events socket. The path is the [grpc] address in config.toml, and
// /run/containerd/containerd.sock is the usual default.
const containerdSock = "/run/containerd/containerd.sock"

func main() {
	ctx := context.Background()

	// Load a placeholder eBPF program that sets R0 to 0 and returns.
	// This is illustrative; a real tracer would usually be compiled from C
	// and loaded from an ELF object.
	spec := &ebpf.ProgramSpec{
		Name:    "containerd_event_tracer",
		Type:    ebpf.Kprobe,
		License: "GPL",
		Instructions: asm.Instructions{
			asm.LoadImm(asm.R0, 0, asm.DWord),
			asm.Return(),
		},
	}
	prog, err := ebpf.NewProgram(spec)
	if err != nil {
		log.Fatalf("failed to create eBPF program: %v", err)
	}
	defer prog.Close()

	// Attachment is skipped here. A kprobe attaches to a kernel function;
	// tracing containerd itself, which is written in Go, would need uprobes
	// on its binary instead.

	log.Printf("Connecting to containerd socket: %s", containerdSock)
	client, err := containerd.New(containerdSock)
	if err != nil {
		log.Fatalf("failed to connect to containerd: %v", err)
	}
	defer client.Close()

	// Subscribe to all events. Optional filter strings can narrow the stream.
	eventCh, errCh := client.Subscribe(ctx)
	log.Println("Subscribed to containerd events. Waiting for container lifecycle events...")

	for {
		select {
		case env := <-eventCh:
			if env == nil {
				continue
			}
			// Topics look like "/containers/create" or "/tasks/start".
			log.Printf("Received event: Namespace=%s Topic=%s", env.Namespace, env.Topic)

			// env.Event is a protobuf Any; decode it into its concrete type.
			ev, err := typeurl.UnmarshalAny(env.Event)
			if err != nil {
				log.Printf("failed to unmarshal %s event: %v", env.Topic, err)
				continue
			}
			switch e := ev.(type) {
			case *apievents.ContainerCreate:
				// No PID here: the container's process does not exist yet.
				log.Printf("Container created: ID=%s, Image=%s", e.ID, e.Image)
			case *apievents.TaskStart:
				log.Printf("Task started: ContainerID=%s, PID=%d", e.ContainerID, e.Pid)
				// Here you could trigger your eBPF program to start tracing this
				// container, or use the PID to correlate eBPF data.
			case *apievents.ContainerDelete:
				log.Printf("Container deleted: ID=%s", e.ID)
			}
			// Handle other event types (task exit, pause, resume, etc.) as needed.
		case err := <-errCh:
			log.Printf("event stream closed: %v", err)
			return
		case <-ctx.Done():
			return
		case <-time.After(30 * time.Second): // Periodic liveness message
			log.Println("Still running, waiting for events...")
		}
	}
}
To run this, you’d need:
- A containerd installation: Ensure containerd is running and accessible. The socket path is crucial. You can find it in the [grpc] section of your containerd configuration (often in /etc/containerd/config.toml or /usr/local/etc/containerd/config.toml).
- The containerd Go client library: go get github.com/containerd/containerd/v2/client
- The cilium/ebpf library: go get github.com/cilium/ebpf
- Root privileges: both the containerd socket and eBPF program loading normally require them.
This program connects to containerd’s event stream. containerd serves an events API over gRPC on its main Unix domain socket (typically /run/containerd/containerd.sock) and publishes lifecycle events for containers, tasks, images, and other resources there. By subscribing to this stream, your eBPF tooling can react to these events in real time.
The core idea is to correlate eBPF data (like network packets, syscalls, or function calls) with container lifecycles. When a container’s task starts, the event carries the host PID of its init process, and from that PID you can look up namespace and cgroup information under /proc. This allows your eBPF programs to filter or tag data specifically for that container. For example, an eBPF program tracing network connections could associate each connection with the container ID that initiated it.
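One way to get from a PID to namespace information is to read the symlinks under /proc/<pid>/ns. The following is a minimal sketch, assuming Linux and a PID taken from a task start event; the helper names parseNsLink and netnsInode are illustrative and not part of any library.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseNsLink extracts the namespace kind and inode number from a namespace
// symlink target such as "net:[4026531840]".
func parseNsLink(target string) (kind string, inode uint64, err error) {
	open := strings.Index(target, ":[")
	if open < 0 || !strings.HasSuffix(target, "]") {
		return "", 0, fmt.Errorf("unexpected namespace link %q", target)
	}
	inode, err = strconv.ParseUint(target[open+2:len(target)-1], 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("bad inode in %q: %w", target, err)
	}
	return target[:open], inode, nil
}

// netnsInode returns the network namespace inode of a host PID by reading
// the /proc/<pid>/ns/net symlink.
func netnsInode(pid int) (uint64, error) {
	target, err := os.Readlink(fmt.Sprintf("/proc/%d/ns/net", pid))
	if err != nil {
		return 0, err
	}
	_, inode, err := parseNsLink(target)
	return inode, err
}

func main() {
	// Uses this process's own PID as a stand-in for a container PID.
	inode, err := netnsInode(os.Getpid())
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("netns inode of this process: %d\n", inode)
}
```

Two processes that report the same inode share a network namespace, which makes the inode a convenient join key between user-space metadata and values read inside an eBPF program.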
The containerd event stream provides topics like /containers/create, /tasks/start, /tasks/exit, and /containers/delete. Each event carries a protobuf-encoded payload specific to its type, which you deserialize to get the details: the container create event carries the container ID and image name, while the task start event carries the container ID and PID.
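Because the PID arrives on the task start topic rather than with the image name on the container create topic, a consumer usually has to join the two by container ID. The sketch below shows that join with small stand-in structs (containerCreate, taskStart, containerDelete) in place of the decoded protobuf messages; those types and the tracker itself are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// Stand-ins for decoded event payloads (hypothetical, for illustration).
type containerCreate struct{ ID, Image string }
type containerDelete struct{ ID string }
type taskStart struct {
	ContainerID string
	Pid         uint32
}

// record collects everything known about one container.
type record struct {
	ID    string
	Image string
	Pid   uint32 // 0 until the task has started
}

// tracker joins events from different topics by container ID.
type tracker struct{ byID map[string]*record }

func newTracker() *tracker { return &tracker{byID: make(map[string]*record)} }

// get returns the record for id, creating it on first use.
func (t *tracker) get(id string) *record {
	r, ok := t.byID[id]
	if !ok {
		r = &record{ID: id}
		t.byID[id] = r
	}
	return r
}

// handle dispatches on topic and updates the joined record. It returns the
// record when a task start makes the PID known, and nil otherwise.
func (t *tracker) handle(topic string, payload any) (*record, error) {
	switch topic {
	case "/containers/create":
		e, ok := payload.(containerCreate)
		if !ok {
			return nil, fmt.Errorf("topic %s: unexpected payload %T", topic, payload)
		}
		t.get(e.ID).Image = e.Image
	case "/tasks/start":
		e, ok := payload.(taskStart)
		if !ok {
			return nil, fmt.Errorf("topic %s: unexpected payload %T", topic, payload)
		}
		r := t.get(e.ContainerID)
		r.Pid = e.Pid
		return r, nil
	case "/containers/delete":
		e, ok := payload.(containerDelete)
		if !ok {
			return nil, fmt.Errorf("topic %s: unexpected payload %T", topic, payload)
		}
		delete(t.byID, e.ID)
	}
	return nil, nil // other topics are ignored
}

func main() {
	t := newTracker()
	t.handle("/containers/create", containerCreate{ID: "my-alpine", Image: "docker.io/library/alpine:latest"})
	r, err := t.handle("/tasks/start", taskStart{ContainerID: "my-alpine", Pid: 4242})
	if err != nil || r == nil {
		fmt.Println("join failed:", err)
		return
	}
	fmt.Printf("%s (%s) runs as PID %d\n", r.ID, r.Image, r.Pid)
}
```

Keying the join on container ID also covers restarts, where a new task start event for the same container simply overwrites the stored PID.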
The real power comes when you combine this with eBPF. Imagine a network observability tool. When a container starts, your eBPF program can attach to the relevant network interfaces using the container’s PID or network namespace. It can then use eBPF maps to store metadata about the container (like its ID) and associate it with network traffic traced by other eBPF programs.
What makes this combination useful is that it bridges the container runtime’s abstract concepts (like "container") and the kernel’s low-level primitives (like PIDs, network sockets, and syscalls). You’re not just observing network traffic; you’re observing network traffic by container ID or by image name, and, once you look up the container’s labels through the containerd client, by Kubernetes pod metadata as well, all driven by containerd’s event stream.
The containerd event stream is not just for container lifecycles. It also publishes events for images, snapshots, and tasks. You can tap into these to build observability tooling that understands not only which containers are running but also how their images and storage are managed. For instance, you could watch image events to see when new images appear on a node, or snapshot events to follow snapshot creation and removal.
One critical aspect often overlooked is how containerd’s event stream handles namespaces. A subscription receives events from every containerd namespace, and each event envelope records the namespace it came from. Your eBPF tooling needs to be aware of these namespaces (e.g., k8s.io for containers created by Kubernetes, or default for containers started with ctr) to correctly attribute eBPF data to the right context. The containerd client library accepts filter expressions when you subscribe, so you can select events by namespace or topic, which is essential for multi-tenant or complex orchestration environments.
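containerd’s filter syntax lets the server do this narrowing. Within one filter string, comma-separated clauses must all match, while separate filter strings passed to the same subscription are alternatives. The helper below only builds such a string and does not talk to containerd; the function name eventFilter is hypothetical, and the result is meant to be passed as an argument to the client’s Subscribe call.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// eventFilter builds a single filter string in which every non-empty clause
// must match. Values are quoted so that dots and slashes are taken literally
// by the filter parser.
func eventFilter(namespace, topicRegex string) string {
	var clauses []string
	if namespace != "" {
		clauses = append(clauses, "namespace=="+strconv.Quote(namespace))
	}
	if topicRegex != "" {
		clauses = append(clauses, "topic~="+strconv.Quote(topicRegex))
	}
	return strings.Join(clauses, ",")
}

func main() {
	// Task events from the namespace used by Kubernetes.
	fmt.Println(eventFilter("k8s.io", "^/tasks/"))
	// prints: namespace=="k8s.io",topic~="^/tasks/"
}
```

Filtering on the server keeps unrelated namespaces out of the consumer entirely, which matters on busy nodes where most events belong to workloads the tracer does not care about.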
The next step is to explore how to use this event data to dynamically load, unload, or configure eBPF programs and maps based on container lifecycles.