Kata Containers can be used as a runtime for containerd, offering enhanced VM-level isolation for your containers.

Here’s a containerd configuration that registers Kata Containers as a runtime named kata:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
  privileged_without_host_devices = true
  runtime_type = "io.containerd.kata.v2"
  runtime_root = "/opt/kata/rootfs"
  sandbox_cgroup_parent = "kubepods"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata.options]
    SystemdCgroup = true

This configuration registers a runtime handler named kata, so containerd can delegate container creation to Kata. When containerd receives a request to run a container that specifies the kata runtime (or if kata is set as the default), it invokes Kata Containers. Kata then starts a lightweight virtual machine (VM) and runs the container workload inside it. This provides a stronger security boundary than traditional container runtimes, because each workload is separated from the host by a hypervisor and runs on its own guest kernel.
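If kata should be the default rather than an opt-in handler, containerd's CRI plugin has a separate setting for that. A minimal sketch, assuming the same CRI plugin section naming as the configuration above:

```toml
# Assumption: same config layout as the runtime table shown earlier.
[plugins."io.containerd.grpc.v1.cri".containerd]
  # Pods that do not name a runtime handler fall back to this runtime entry.
  default_runtime_name = "kata"
```

Without this setting, only requests that explicitly name the kata handler run under Kata.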

The problem Kata Containers solves is the inherent security limitation of traditional containerization. Containers offer process-level isolation, but they share the host kernel. A kernel exploit in one container could potentially compromise the entire host system and every other container running on it. Kata Containers mitigates this risk by running each container (or a pod of containers) inside its own minimal VM. This VM has its own kernel, separate from the host’s kernel, which creates a much stronger isolation boundary.

Internally, when containerd asks Kata to run a container, it starts Kata's shim process. This shim is responsible for setting up the VM environment, launching the guest kernel, and then executing the container’s processes within that VM. The shim also manages the lifecycle of the VM and the container, reporting status back to containerd. Kata uses a hypervisor such as QEMU/KVM, Firecracker, or Cloud Hypervisor to create these VMs. The runtime_type = "io.containerd.kata.v2" setting in the containerd configuration selects the Kata shim: containerd maps that name to a binary called containerd-shim-kata-v2, which must be on its PATH.
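The shim reads its settings (hypervisor choice, guest kernel, and so on) from Kata's own configuration file. As a hedged sketch, the runtime's options table can point the shim at a specific file through ConfigPath; the path below is an assumption based on a typical /opt/kata install and may differ on your system:

```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
  # Resolved by containerd to the binary containerd-shim-kata-v2.
  runtime_type = "io.containerd.kata.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata.options]
    # Assumed location of Kata's configuration file; adjust to your install.
    ConfigPath = "/opt/kata/share/defaults/kata-containers/configuration.toml"
```

Registering several runtime entries, each with a different ConfigPath, is one way to offer more than one hypervisor on the same host.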

The sandbox_cgroup_parent = "kubepods" entry is aimed at Kubernetes environments. The intent is to place the cgroups for the Kata sandbox (the VM) under the Kubernetes hierarchy, so that resource limits and QoS apply to Kata-based pods as they do to standard pods. Check this key against the documentation for your containerd version, since it is not among the commonly documented per-runtime options; in Kubernetes the cgroup parent normally arrives from the kubelet with each pod sandbox request. SystemdCgroup = true is best known as an option of the runc shim, where it selects the systemd cgroup driver on the host. It does not configure systemd inside the guest, and whether the Kata shim honors it should be verified for your Kata version. privileged_without_host_devices = true stops containerd from passing every host device into a container that is marked privileged. The container still receives its elevated privileges (such as CAP_SYS_ADMIN), but host devices are not exposed to it directly.
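On the Kata side, host cgroup placement for the sandbox is governed by Kata's own configuration file rather than by containerd's. A minimal sketch, assuming the sandbox_cgroup_only key in the [runtime] section of Kata's configuration.toml; confirm its exact behavior against the documentation for your Kata version:

```toml
# Fragment of Kata's configuration.toml (not containerd's config.toml).
[runtime]
# When enabled, Kata keeps its host-side processes for a sandbox in a single
# cgroup under the parent supplied by the orchestrator, instead of creating
# separate per-container cgroups on the host.
sandbox_cgroup_only = true
```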

The runtime_root parameter, set here to /opt/kata/rootfs, is a containerd setting for the directory in which the runtime keeps its state; it does not select the guest's root filesystem. The minimal OS image (or initrd) that Kata boots in each VM is named in Kata’s own configuration file, alongside the guest kernel. When a new Kata sandbox is created, Kata uses that image as the root filesystem for the VM.
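The guest kernel and root filesystem image are typically named in the hypervisor section of Kata's configuration file. A hedged sketch for the QEMU hypervisor; the paths are assumptions modeled on a /opt/kata install and will vary by distribution and Kata version:

```toml
# Fragment of Kata's configuration.toml; all paths are illustrative.
[hypervisor.qemu]
# Hypervisor binary used to launch the VM.
path = "/opt/kata/bin/qemu-system-x86_64"
# Guest kernel booted inside each sandbox VM.
kernel = "/opt/kata/share/kata-containers/vmlinux.container"
# Minimal guest root filesystem image.
image = "/opt/kata/share/kata-containers/kata-containers.img"
```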

One thing many people don’t realize is how granularly Kata can manage its VM resources. Although it is designed for isolation, it also has mechanisms to tune the VM’s CPU, memory, and even network configuration per sandbox. You are not just getting a black-box VM: you can often adjust the underlying VM’s characteristics to optimize performance or resource usage for specific workloads, without sacrificing the isolation benefits. Defaults come from Kata’s configuration file, and pod specifications in orchestration systems can influence them, for example through annotations when the configuration permits it.
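As an illustration of that tuning, the defaults for VM size live in the hypervisor section of Kata's configuration file. A minimal sketch, assuming the QEMU hypervisor section; the values are arbitrary examples, not recommendations:

```toml
# Fragment of Kata's configuration.toml; values are illustrative only.
[hypervisor.qemu]
# vCPUs given to each sandbox VM at boot.
default_vcpus = 1
# Memory given to each sandbox VM at boot, in MiB.
default_memory = 2048
# Hypervisor settings that pod annotations are allowed to override.
enable_annotations = ["default_vcpus", "default_memory"]
```

For annotation overrides to reach Kata, containerd must also be configured to pass the relevant pod annotations through to the runtime.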

The next concept you’ll want to explore is how to integrate Kata Containers with Kubernetes for robust, VM-isolated pod deployments.
