Debug CoreDNS NXDOMAIN Errors in Kubernetes (2026)

CoreDNS is failing to resolve internal Kubernetes service names, resulting in NXDOMAIN errors for pods trying to reach other services.

Common Causes and Fixes for CoreDNS `NXDOMAIN` Errors

CoreDNS Pods Not Running or Crashing:
- Diagnosis: Check the status of CoreDNS pods in the kube-system namespace.
```
kubectl get pods -n kube-system -l k8s-app=kube-dns
```
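  Look for pods in CrashLoopBackOff, Error, or Terminating states. If no pods are listed, the CoreDNS Deployment (named coredns in most distributions) is missing or scaled to zero; check it with `kubectl get deployment coredns -n kube-system`.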
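- Fix: If pods are crashing, check their logs for specific errors. For a container that has already restarted, add --previous to read the logs of the instance that crashed.
```
kubectl logs <coredns-pod-name> -n kube-system
kubectl logs <coredns-pod-name> -n kube-system --previous
```
  Common reasons for crashing include insufficient memory, a misconfigured Corefile, or a forwarding loop detected by the loop plugin. If resources are the problem, increase the requests and limits in the CoreDNS deployment manifest.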
```
# Example snippet from CoreDNS deployment manifest
resources:
  requests:
    cpu: "100m"
    memory: "70Mi"
  limits:
    cpu: "200m"
    memory: "140Mi"
```
  These values are illustrative; size them to your cluster's query load. A memory limit that is too low gets the container OOMKilled, while a CPU limit that is too low throttles CoreDNS until queries time out or health probes fail.
- Why it works: CoreDNS needs stable, running instances to perform DNS lookups. When pods fail, the DNS service becomes unavailable.
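
To confirm whether memory pressure is behind the restarts, it can help to look at each pod's restart count and last termination reason. The following is a minimal sketch rather than a definitive check: `coredns_restart_report` and `has_oomkilled` are hypothetical helper names, and the jsonpath assumes CoreDNS is the first container in each pod.

```shell
# Print "<pod name> <restart count> <last termination reason>" (tab-separated)
# for every CoreDNS pod. Assumes CoreDNS is container index 0 in the pod.
coredns_restart_report() {
  kubectl get pods -n kube-system -l k8s-app=kube-dns \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'
}

# Read a report on stdin; succeed if any pod was last terminated as OOMKilled.
has_oomkilled() {
  awk -F '\t' '$3 == "OOMKilled" { found = 1 } END { exit (found ? 0 : 1) }'
}
```

A possible use is `coredns_restart_report | has_oomkilled && echo "consider raising the memory limit"`. A pod that has never restarted simply shows an empty reason column.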

Incorrect Corefile Configuration:
- Diagnosis: Examine the Corefile ConfigMap used by CoreDNS.
```
kubectl get configmap coredns -n kube-system -o yaml
```
  Look for syntax errors, incorrect zone definitions, or missing essential plugins. A common mistake is an incorrect . (root zone) configuration or a missing kubernetes plugin.
- Fix: Edit the Corefile ConfigMap and correct any errors.
```
kubectl edit configmap coredns -n kube-system
```
  Ensure it looks similar to this, with the kubernetes plugin correctly configured for the cluster domain (usually cluster.local):
```
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
# ... other metadata
```
  The kubernetes plugin is crucial; it tells CoreDNS how to resolve *.svc.cluster.local and *.pod.cluster.local names by querying the Kubernetes API. The forward directive handles external lookups.
- Why it works: The Corefile is the configuration brain of CoreDNS. Correctly defining the kubernetes plugin ensures it knows how to query the cluster’s DNS records.
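
Reading the full ConfigMap YAML by eye is error-prone, so a quick scripted check can be useful. This sketch assumes the ConfigMap is named coredns and stores the configuration under the Corefile key, as in the example above; `get_corefile` and `corefile_serves_domain` are hypothetical helpers, and the check is deliberately naive (it only looks for the zone on the kubernetes plugin line).

```shell
# Print only the Corefile text from the coredns ConfigMap.
get_corefile() {
  kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}'
}

# Read a Corefile on stdin; succeed only if a "kubernetes" plugin line
# lists the given cluster domain (default: cluster.local) as one of its zones.
corefile_serves_domain() {
  domain="${1:-cluster.local}"
  awk -v d="$domain" '
    $1 == "kubernetes" { for (i = 2; i <= NF; i++) if ($i == d) ok = 1 }
    END { exit (ok ? 0 : 1) }'
}
```

For example, `get_corefile | corefile_serves_domain cluster.local || echo "kubernetes plugin missing or wrong domain"`. If the Corefile includes the reload plugin, CoreDNS should pick up ConfigMap edits on its own after a short delay; otherwise `kubectl rollout restart deployment coredns -n kube-system` forces a reload.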

Service Discovery Issues (API Server Unreachable):
- Diagnosis: If the kubernetes plugin in the Corefile is misconfigured or CoreDNS cannot reach the Kubernetes API server, it won’t be able to discover services. Check CoreDNS logs for errors related to kubernetes plugin or API server connectivity.
```
kubectl logs <coredns-pod-name> -n kube-system | grep -i "kubernetes.*failed"
```
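- Fix: Ensure CoreDNS pods have network connectivity to the Kubernetes API server. This usually involves checking network policies, CNI configuration, or firewall rules between the nodes that run CoreDNS and the API server. The CoreDNS image typically ships without a shell or curl, so kubectl exec into it will not work; test from a throwaway pod in kube-system instead.
```
# Example: check API server reachability from a temporary pod (curlimages/curl is just one image that includes curl)
kubectl run api-check -n kube-system --rm -it --restart=Never --image=curlimages/curl --command -- \
  sh -c 'curl -skS https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT/version'
```
  Any HTTP response, even 401 or 403, shows that the network path works. If the request times out or is refused, investigate network policies in kube-system or upstream network configurations. Keep in mind that the temporary pod may land on a different node than CoreDNS and is not selected by policies that target the k8s-app=kube-dns label.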
- Why it works: The kubernetes plugin relies on watching API server endpoints and services. If it can’t reach the API server, it can’t populate its internal cache of cluster services.
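
Because the kubernetes plugin can only answer for Services that exist in the API, it is worth ruling out a simple naming mistake before digging into connectivity. The sketch below builds the name a client should be querying and checks that the Service is really there. `service_fqdn` and `service_exists` are hypothetical helpers, and cluster.local is assumed as the default cluster domain.

```shell
# Build the fully qualified DNS name for a Service:
#   <service>.<namespace>.svc.<cluster-domain>
service_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Succeed if the Service exists in the given namespace according to the API server.
service_exists() {
  kubectl get svc "$1" -n "$2" -o name >/dev/null 2>&1
}
```

For instance, `service_exists web shop && service_fqdn web shop` prints `web.shop.svc.cluster.local` only when a Service named web exists in the shop namespace (both names are placeholders). If the Service is missing, NXDOMAIN is the correct answer and CoreDNS is not at fault.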

Incorrect resolv.conf in Pods:
- Diagnosis: Pods are configured to use CoreDNS as their DNS server via their resolv.conf file. Check the resolv.conf of a pod that’s experiencing NXDOMAIN errors.
```
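kubectl exec -n <namespace> <pod-name> -- cat /etc/resolv.conf
```
  The nameserver entry should point to the ClusterIP of the CoreDNS service, and the search line should include <namespace>.svc.cluster.local so that short service names expand correctly.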
- Fix: A pod's resolv.conf is generated by the kubelet from its clusterDNS and clusterDomain settings (in /var/lib/kubelet/config.yaml, or the --cluster-dns and --cluster-domain command-line flags). If the nameserver is wrong, correct the kubelet configuration, restart the kubelet, and recreate the affected pods, because resolv.conf is written when a pod starts. The nameserver must match the ClusterIP of the kube-dns service:
```
kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}'
```
  This IP should be listed as the nameserver in /etc/resolv.conf.
- Why it works: Pods use the resolv.conf to know which DNS server to query. If this file points to the wrong IP or an unreachable server, DNS resolution will fail.
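
The two commands above can be combined into a single comparison. This is a minimal sketch under the same assumptions as the text (the DNS Service is named kube-dns in kube-system, and the pod image provides cat); `pod_uses_cluster_dns` is a hypothetical helper name.

```shell
# Succeed if the pod's /etc/resolv.conf lists the kube-dns ClusterIP as a nameserver.
# Usage: pod_uses_cluster_dns <namespace> <pod-name>
pod_uses_cluster_dns() {
  ns="$1"
  pod="$2"
  # ClusterIP that pods are expected to use for DNS
  dns_ip=$(kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}')
  # Compare it against every nameserver line in the pod's resolv.conf
  kubectl exec -n "$ns" "$pod" -- cat /etc/resolv.conf |
    awk -v ip="$dns_ip" '$1 == "nameserver" && $2 == ip { ok = 1 } END { exit (ok ? 0 : 1) }'
}
```

A failing check is not always a fault: a pod with `dnsPolicy: Default` inherits the node's resolver and will legitimately show a different nameserver.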

CoreDNS Service Not Available or Incorrect ClusterIP:
- Diagnosis: Verify that the kube-dns service exists and has the correct ClusterIP.
```
kubectl get svc kube-dns -n kube-system
```
  The output should show a CLUSTER-IP and PORT(S) like 53/UDP,53/TCP.
- Fix: If the service is missing or has the wrong ClusterIP, recreate it. The kube-dns Service is part of the cluster's DNS add-on, so restore it with whatever tool installed CoreDNS (kubeadm, a Helm chart, or your managed platform's add-on mechanism). Its ClusterIP must match the address the kubelet hands to pods, and because a Service's ClusterIP is immutable, changing it means deleting and recreating the Service.
```
# Example: on a kubeadm-managed cluster, re-apply the CoreDNS add-on (run on a control plane node)
kubeadm init phase addon coredns
```
  This restores a stable endpoint for pods to query. Review the result afterwards, since the command re-applies the whole add-on and not only the Service.
- Why it works: The kube-dns service acts as the stable, internal IP address that all pods are configured to use for DNS resolution. If this service is broken, pods can’t find CoreDNS.
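
A Service that exists but has no ready backends is just as broken as a missing one, so it can be worth checking its endpoints too. The sketch below assumes the classic Endpoints API is still served by your cluster (newer releases prefer EndpointSlices and may print a deprecation warning); `kube_dns_endpoints` and `kube_dns_has_endpoints` are hypothetical helper names.

```shell
# Print the pod IPs currently backing the kube-dns Service, one per line.
kube_dns_endpoints() {
  kubectl get endpoints kube-dns -n kube-system \
    -o jsonpath='{range .subsets[*].addresses[*]}{.ip}{"\n"}{end}'
}

# Succeed if at least one ready CoreDNS pod is behind the Service.
kube_dns_has_endpoints() {
  [ -n "$(kube_dns_endpoints)" ]
}
```

If `kube_dns_has_endpoints` fails while CoreDNS pods are running, compare the Service's selector with the pod labels and check the pods' readiness.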

Network Policies Blocking DNS Traffic:
- Diagnosis: If network policies are in place, they might be preventing pods from reaching the CoreDNS service on port 53 (UDP/TCP). Check network policy definitions in the namespace where the client pod resides and in kube-system.
```
kubectl get networkpolicy -n <client-pod-namespace>
kubectl get networkpolicy -n kube-system
```
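  NetworkPolicies are allow-lists: once any policy with policyTypes: Egress selects a pod, all egress that is not explicitly allowed is dropped, including DNS. Look for egress policies that select the client pod but have no rule for port 53, and for ingress policies in kube-system that select the CoreDNS pods.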
- Fix: Add an allow rule to the relevant network policy that permits egress traffic from pods to the kube-dns service (or all pods in kube-system) on port 53.
```
# Example: Allow egress to kube-dns service
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: <client-pod-namespace>
spec:
  podSelector: {} # Applies to all pods in the namespace
  policyTypes:
  - Egress
  egress:
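  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system # Namespace where CoreDNS runs
      podSelector:
        matchLabels:
          k8s-app: kube-dns # Label for CoreDNS pods
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```
  NetworkPolicies are additive, so this rule permits DNS traffic even when other policies in the namespace restrict egress. The namespaceSelector is needed because a podSelector on its own matches only pods in the policy's own namespace, and CoreDNS runs in kube-system. The kubernetes.io/metadata.name label is set automatically on namespaces in current Kubernetes versions.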
- Why it works: Network policies enforce traffic segmentation. Without a specific allow rule, traffic to CoreDNS might be blocked by default, preventing resolution.
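
Blocked DNS traffic usually shows up as a timeout, not as an NXDOMAIN answer, so telling the two apart narrows the search quickly. The following sketch runs a lookup from a throwaway pod in the client namespace and classifies the result. `lookup_from_namespace` and `classify_lookup` are hypothetical helpers, busybox:1.36 is just one image that includes nslookup, and the matched strings assume BusyBox-style nslookup output.

```shell
# Run a one-off lookup of name $2 from a temporary pod in namespace $1.
lookup_from_namespace() {
  kubectl run dns-check -n "$1" --rm -i --restart=Never --quiet \
    --image=busybox:1.36 -- nslookup "$2" 2>&1
}

# Classify nslookup output read from stdin:
#   nxdomain    - a DNS server answered, but the name does not exist
#   unreachable - no DNS server answered (traffic is likely being dropped)
#   other       - anything else, including a successful answer
classify_lookup() {
  out=$(cat)
  case "$out" in
    *NXDOMAIN*) echo "nxdomain" ;;
    *"timed out"*|*"no servers could be reached"*) echo "unreachable" ;;
    *) echo "other" ;;
  esac
}
```

For example, `lookup_from_namespace <client-pod-namespace> kubernetes.default.svc.cluster.local | classify_lookup`. A result of unreachable points at network policies or the Service path, while nxdomain points back at the Corefile or the queried name. The temporary pod is only subject to policies whose podSelector matches it, such as the empty selector used above.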

After fixing NXDOMAIN errors, the next problem you are likely to hit is CoreDNS pods in CrashLoopBackOff, if the underlying resource constraints have not been addressed.
