Pod garbage collection isn’t just about cleaning up deleted pods; it’s a critical mechanism for managing cluster resources and preventing runaway resource consumption.

Consider how this plays out. Imagine a deployment that scales up rapidly, creating dozens of pods. When you scale it down, or when pods fail and are replaced, the control plane must eventually remove the old pod objects, which the API server stores in etcd. If this cleanup stalls, these "ghost" pods take up etcd storage and clutter API server list responses, which slows down operations.

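Here’s a typical scenario: a pod enters the Terminating state. Normally the kubelet stops its containers and the API server then removes the pod object, while the kube-controller-manager cleans up pods the kubelet can no longer finish, such as those bound to a node that no longer exists. But what if a pod gets stuck?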

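The primary component responsible for garbage collection of pods is the pod garbage collector (PodGC) controller, which runs inside the kube-controller-manager. It works entirely through the API server, which is the only component that reads from and writes to etcd. PodGC deletes terminated pods (phase Succeeded or Failed) once their count exceeds the --terminated-pod-gc-threshold setting, and it removes orphaned pods bound to nodes that no longer exist, as well as terminating pods that were never scheduled.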
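A quick way to see which pods are currently waiting on deletion is to filter the pod list by status. The sketch below assumes the default column layout of kubectl get pods -A --no-headers (NAMESPACE, NAME, READY, STATUS, ...). The function name list_terminating is hypothetical and is not part of kubectl.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kubectl): print "namespace/name" for every
# pod whose STATUS column reads Terminating.
# Input on stdin: the output of `kubectl get pods -A --no-headers`, assumed to
# use the default columns NAMESPACE NAME READY STATUS RESTARTS AGE.
list_terminating() {
  awk '$4 == "Terminating" { print $1 "/" $2 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods -A --no-headers | list_terminating
```

Pods that stay in this list well past their termination grace period are the ones worth investigating with the checks below.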

Common Causes for Stuck Pod Garbage Collection

  1. kube-controller-manager Not Running or Crashing:

    • Diagnosis: Check the status of the kube-controller-manager pods in the kube-system namespace:
      kubectl get pods -n kube-system | grep kube-controller-manager
      
      If the pods are not Running or are in a CrashLoopBackOff state, this is your primary suspect. Examine their logs:
      kubectl logs <kube-controller-manager-pod-name> -n kube-system
      
    • Fix: Address the underlying reason for the kube-controller-manager crashing. This could be resource starvation (CPU/memory), misconfiguration in its manifest, or issues with its dependencies (like etcd or API server connectivity). Restarting the control plane components or resolving the resource issue will allow it to resume garbage collection.
    • Why it works: The pod garbage collector runs inside the kube-controller-manager. If that process isn’t running, terminated and orphaned pods are never collected.
  2. Etcd Performance Issues:

    • Diagnosis: Monitor etcd’s performance. High latency, frequent disconnections, or high CPU/disk I/O on etcd nodes slow every API server request, including the list and delete calls the controller manager depends on. Check etcd logs for errors related to disk performance or network issues.
      # On etcd nodes
      journalctl -u etcd -f
      # Or check metrics if you have Prometheus/Grafana set up
      
    • Fix: Optimize etcd performance. This might involve ensuring etcd nodes have fast SSDs, adequate CPU and RAM, and a stable network. For persistent issues, consider scaling up etcd resources or tuning etcd’s --auto-compaction-retention and --quota-backend-bytes settings.
    • Why it works: Every pod listing and deletion ends up as an etcd read or write made by the API server. If etcd is sluggish, the kube-controller-manager cannot efficiently find and delete terminated pods.
  3. API Server Overload or Unresponsiveness:

    • Diagnosis: If the API server is overloaded with requests, the kube-controller-manager might experience timeouts or delays when trying to interact with it. Check API server logs and metrics for high latency or error rates.
      # From any machine with kubectl access to the cluster
      kubectl logs <api-server-pod-name> -n kube-system
      # Check API server metrics for request latency and error counts
      
    • Fix: Scale up the API server replicas or ensure it has sufficient resources. Identify and address the source of excessive API requests if possible.
    • Why it works: The kube-controller-manager relies on the API server to communicate with etcd and other cluster components. API server unresponsiveness directly impacts the controller manager’s operations.
  4. Network Partition or Firewall Issues:

    • Diagnosis: Verify that the kube-controller-manager can reach the API server endpoint, and that the API server can reach the etcd cluster. The controller manager never connects to etcd directly. Check network policies, firewall rules, and CNI configurations. Control plane images often ship without a shell or nc, so kubectl exec into those pods may fail; running the checks from the hosting nodes is usually more reliable.
      # On the node running kube-controller-manager
      nc -vz <api-server-host> <api-server-port>
      # On the node running kube-apiserver
      nc -vz <etcd-host> <etcd-port>
      
    • Fix: Correct any misconfigured network policies, firewall rules, or CNI settings that are blocking communication between the kube-controller-manager, API server, and etcd.
    • Why it works: If the controller manager can’t reach the API server, or the API server can’t reach etcd, garbage collection cannot proceed.
  5. Stuck Pod Finalizers:

    • Diagnosis: A pod stuck in Terminating state with no apparent progress often indicates an issue with its finalizers. A finalizer is a namespaced key, listed under metadata.finalizers, that tells Kubernetes to wait until a specific controller has finished its cleanup before the object is fully deleted. If the controller responsible for a finalizer fails or is not implemented correctly, the pod will remain stuck.
      kubectl get pod <stuck-pod-name> -n <namespace> -o yaml
      
      Look for the metadata.finalizers field. If it’s present and the associated controller isn’t acting, that’s the problem.
    • Fix: Manually remove the finalizer from the pod’s definition. This is typically done by editing the pod object directly:
      kubectl edit pod <stuck-pod-name> -n <namespace>
      
      Then remove the relevant entry under metadata.finalizers. Caution: This bypasses the normal cleanup process associated with the finalizer. Ensure you understand which finalizer is present and what cleanup it was intended to perform.
    • Why it works: Removing the finalizer tells Kubernetes that the external cleanup is no longer required. Once a pod marked for deletion has an empty finalizer list, the API server removes it from etcd.
  6. Resource Quotas or Limit Ranges:

    • Diagnosis: While less common for stuck garbage collection, aggressive resource quotas or limit ranges could, in theory, impact the kube-controller-manager’s ability to operate if it’s running as a pod itself and constrained. Check the kube-controller-manager pod’s resource requests/limits and any relevant quotas in its namespace.
    • Fix: Adjust resource quotas or limit ranges if they are excessively restrictive for control plane components.
    • Why it works: Ensures the kube-controller-manager has sufficient resources to perform its background tasks.
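
For the finalizer case (cause 5), a non-interactive alternative to kubectl edit is a merge patch that sets metadata.finalizers to null. The sketch below only prints the commands so they can be reviewed before anything changes. The helper name finalizer_commands is hypothetical, and the same caution applies: clearing finalizers skips whatever cleanup they were guarding.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (but do not run) the kubectl commands for
# inspecting and then clearing a pod's finalizers.
# Arguments: $1 = namespace, $2 = pod name.
finalizer_commands() {
  local ns="$1" pod="$2"
  # 1. Show which finalizers are still set on the pod.
  printf "kubectl get pod %s -n %s -o jsonpath='{.metadata.finalizers}'\n" "$pod" "$ns"
  # 2. Clear the finalizer list with a JSON merge patch.
  printf "kubectl patch pod %s -n %s --type=merge -p '{\"metadata\":{\"finalizers\":null}}'\n" "$pod" "$ns"
}

# Usage: print the commands, read them, then run them by hand.
#   finalizer_commands <namespace> <stuck-pod-name>
```

Printing rather than executing keeps a human in the loop, which matters here because the patch cannot be undone once the pod object is gone.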

After resolving these issues, you’ll likely encounter the next challenge: ensuring your cluster state is consistent and that no lingering objects (like Terminating namespaces or other stuck resources) remain.
