crictl lets you bypass Kubernetes’ abstractions and talk directly to the container runtime, which is surprisingly useful for debugging.
Let’s see crictl in action. Imagine a pod that’s stuck in a CrashLoopBackOff state.
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-broken-app-7f7f8d9f9c-abcde 0/1 CrashLoopBackOff 5 3m 10.244.0.10 worker-node-1 <none> <none>
$ kubectl logs my-broken-app-7f7f8d9f9c-abcde
Error from server (BadRequest): container "my-app" in pod "my-broken-app-7f7f8d9f9c-abcde" is waiting to start: ContainerCreating
kubectl logs is giving us a hint, but it’s not enough to figure out why it’s stuck in ContainerCreating. This is where crictl shines.
First, we need to find the container ID. We can do this by listing pods managed by the container runtime on the node where the pod is scheduled.
$ kubectl get pod my-broken-app-7f7f8d9f9c-abcde -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-broken-app-7f7f8d9f9c-abcde 0/1 CrashLoopBackOff 5 3m 10.244.0.10 worker-node-1 <none> <none>
# SSH into worker-node-1
$ ssh worker-node-1
# List pods with crictl, filtering by the pod name prefix
$ sudo crictl pods | grep my-broken-app-7f7f8d9f9c
<long-hex-string> 3 minutes ago Ready my-broken-app-7f7f8d9f9c-abcde <namespace> 0 (default)
# Get the sandbox ID from the previous command (the first hex string)
# Let's say it's a1b2c3d4e5f6
SANDBOX_ID="a1b2c3d4e5f6"
# List all containers within that sandbox (-a includes containers that are not running)
$ sudo crictl ps -a --pod "$SANDBOX_ID"
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID
<container-id> <your-image-name>:<tag> 2 minutes ago Created my-app 0 a1b2c3d4e5f6
Now we have the CONTAINER ID. We can use this to inspect the container’s state and logs directly.
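The lookup so far (pod name to sandbox ID to container IDs) can also be scripted. The following is a minimal sketch, not a definitive tool: the function names and the CRICTL override variable are hypothetical, and it assumes a crictl version that supports the --name, --pod, -a, and -q options.

```shell
#!/usr/bin/env bash
# Sketch: resolve a pod name to its sandbox ID and container IDs on the node.
# CRICTL is a hypothetical override hook; it defaults to "sudo crictl".
CRICTL="${CRICTL:-sudo crictl}"

# Print the ID of the first sandbox whose name matches the given pattern.
sandbox_id_for_pod() {
  # --name filters sandboxes by pod name pattern; -q prints only IDs.
  $CRICTL pods --name "$1" -q | head -n 1
}

# Print the IDs of all containers (running or not) in a sandbox.
container_ids_for_sandbox() {
  # -a includes containers that are not running; --pod filters by sandbox ID.
  $CRICTL ps -a --pod "$1" -q
}
```

On the node, this could be used as SANDBOX_ID="$(sandbox_id_for_pod my-broken-app-7f7f8d9f9c-abcde)" followed by container_ids_for_sandbox "$SANDBOX_ID".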
# Inspect the container's status and configuration
$ sudo crictl inspect <container-id>
{
"status": {
"id": "<container-id>",
"podSandboxId": "a1b2c3d4e5f6",
"image": {
"image": "<your-image-name>:<tag>"
},
"state": "CONTAINER_CREATED", // crictl prints the CRI state name: CONTAINER_CREATED, CONTAINER_RUNNING, CONTAINER_EXITED, or CONTAINER_UNKNOWN
"createdAt": 1678886400000000000,
"startedAt": 0, // Zero indicates it never started
"finishedAt": 0,
"exitCode": 1, // Non-zero exit code
"reason": "Error",
"message": "failed to create container: OCI runtime create failed: ...", // This is the crucial part!
// ... other details
}
}
The message field in the inspect output is gold. It often contains the direct error from the underlying OCI runtime (like runc or crun). In this case, OCI runtime create failed suggests an issue with how the container was being set up, not necessarily an application error within the container.
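When only the state, reason, and message matter, the full JSON can be skipped. Here is a hedged sketch that uses crictl's go-template output mode; the function name and the CRICTL override variable are hypothetical, and the field paths assume the status layout shown above.

```shell
#!/usr/bin/env bash
# Sketch: print only the fields that usually explain a failed container.
# CRICTL is a hypothetical override hook; it defaults to "sudo crictl".
CRICTL="${CRICTL:-sudo crictl}"

# Print the state, reason, and message of a container, one per line.
container_status_summary() {
  # The template is evaluated against the same data that "crictl inspect"
  # prints as JSON, so .status.message is the field discussed above.
  $CRICTL inspect --output go-template \
    --template 'state: {{.status.state}}
reason: {{.status.reason}}
message: {{.status.message}}' "$1"
}
```

Calling container_status_summary <container-id> on the node prints three labeled lines; a field that is absent from the status is rendered by the template engine as a placeholder rather than an error.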
If the container had run and then exited, crictl logs would be your friend:
# If the container exited, you can fetch its logs
$ sudo crictl logs <container-id>
<Actual error message from the application or runtime>
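In a crash loop, every restart creates a new container, so several exited containers may exist in the same sandbox. A rough sketch for dumping the tail of each one's logs follows; the function name, the default of 20 lines, and the CRICTL override variable are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: show the last log lines of every exited container in a sandbox.
# CRICTL is a hypothetical override hook; it defaults to "sudo crictl".
CRICTL="${CRICTL:-sudo crictl}"

# Usage: exited_container_logs <sandbox-id> [lines]
exited_container_logs() {
  local sandbox_id="$1"
  local lines="${2:-20}"
  local cid
  # -a includes non-running containers; --state narrows the list to exited ones.
  for cid in $($CRICTL ps -a --pod "$sandbox_id" --state exited -q); do
    echo "--- logs for container $cid ---"
    # --tail limits output to the last N lines of the container log.
    $CRICTL logs --tail "$lines" "$cid"
  done
}
```

For example, exited_container_logs "$SANDBOX_ID" 50 would print up to 50 lines per exited container. The loop makes no assumption about the order in which the runtime lists containers.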
The most common cause of a pod stuck in ContainerCreating or CrashLoopBackOff when kubectl logs shows nothing useful is a problem with the container image itself or its configuration during creation. This could be:
- Image Pull Issues: The container runtime cannot find or pull the specified image.
  - Diagnosis: Run sudo crictl images to see whether the image is present locally, then check sudo crictl inspecti <image-name>:<tag> for its details.
  - Fix: Ensure the image name and tag are correct. If it’s a private registry, verify that imagePullSecrets are correctly configured in your Kubernetes deployment and that the registry credentials are valid.
  - Why it works: crictl images directly queries the runtime’s image store, bypassing Kubernetes’ API layer.
- Invalid Image Command/Args: The command or args specified in the Pod spec are malformed or point to non-existent executables within the image.
  - Diagnosis: Run sudo crictl inspect <container-id> and examine the command and args the runtime received. With containerd these typically appear under info.config, with the final process arguments under info.runtimeSpec.process.args; the exact layout varies by runtime and version.
  - Fix: Correct the command and args in your Kubernetes deployment YAML.
  - Why it works: crictl inspect shows the exact arguments the runtime received, allowing you to spot typos or incorrect paths.
- Filesystem/Volume Mount Problems: The container needs to access a volume or path that doesn’t exist, is inaccessible, or has incorrect permissions.
  - Diagnosis: Look for messages like "no such file or directory" or "permission denied" in the message field of the sudo crictl inspect <container-id> output, or in sudo crictl logs <container-id> if the container ever started.
  - Fix: Verify PersistentVolumeClaims (PVCs) are bound, check underlying storage, and ensure the securityContext in your pod spec doesn’t conflict with the filesystem permissions.
  - Why it works: Runtime errors during mount operations will often surface in crictl inspect or logs, indicating the problematic path or permission.
- Resource Constraints: The node doesn’t have enough CPU, memory, or ephemeral storage to create the container.
  - Diagnosis: Check dmesg output on the node for OOM killer messages or other kernel-level resource exhaustion errors related to the container runtime.
  - Fix: Increase node resources, reduce pod resource requests/limits, or use a node with more capacity.
  - Why it works: Kernel-level messages are the ultimate source of truth for resource exhaustion, and crictl will reflect the resulting container state.
- OCI Runtime Errors: Underlying issues with the container runtime itself (e.g., containerd, cri-o) or its configuration.
  - Diagnosis: Check the runtime’s service logs with sudo journalctl -u containerd or sudo journalctl -u crio. Look for errors around the time the pod was scheduled.
  - Fix: Restart the container runtime service, or investigate the specific error in the runtime’s logs.
  - Why it works: crictl is only a client of the runtime; the runtime’s own service logs show the problems it is actually encountering.
- Security Context/AppArmor/SELinux Issues: The container’s security profile prevents it from starting or performing necessary operations.
  - Diagnosis: Check the kernel log (sudo journalctl -k or dmesg) and, where auditd is running, /var/log/audit/audit.log for SELinux "avc: denied" messages or AppArmor "DENIED" entries.
  - Fix: Adjust securityContext, AppArmor profiles, or SELinux policies in your pod spec or on the node.
  - Why it works: These security mechanisms operate at the kernel level and log their denials. crictl’s output may only reflect them indirectly, so the system logs are the place to confirm them.
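As a rough first-pass triage, the message text from crictl inspect or the system logs can be matched against the categories above. The sketch below is a heuristic only: the function name, the category labels, and the match patterns are assumptions chosen for illustration, and real error strings vary by runtime and version.

```shell
#!/usr/bin/env bash
# Sketch: map an error message to the likely cause category from the list above.
# The patterns are illustrative guesses, not an exhaustive or official mapping.
classify_runtime_error() {
  local msg
  # Lowercase the message so matching is case-insensitive.
  msg="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$msg" in
    *"avc"*"denied"*|*"apparmor"*)
      echo "security-policy" ;;
    *"pull"*|*"manifest unknown"*)
      echo "image-pull" ;;
    *"executable file not found"*)
      echo "command-or-args" ;;
    *"no such file or directory"*|*"permission denied"*)
      echo "volume-or-filesystem" ;;
    *"out of memory"*|*"oom-kill"*|*"oomkilled"*)
      echo "resources" ;;
    *"oci runtime"*)
      echo "oci-runtime" ;;
    *)
      echo "unknown" ;;
  esac
}
```

The order of the branches matters: more specific patterns come first, and the generic "oci runtime" match is only a fallback, since many of the other failures are also reported wrapped in an OCI runtime error. Treat the result as a hint about where to look next, not as a diagnosis.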
After fixing the underlying issue (e.g., correcting an image tag, fixing volume permissions), the pod should eventually transition to Running.
The next problem you’ll likely hit is understanding how crictl interacts with Kubernetes’ internal objects.