Troubleshoot Kubernetes DNS Resolution Failures with CoreDNS (2026)

CoreDNS fails to resolve external hostnames when it can't reach the upstream DNS servers it is configured to forward to.

This isn’t just a "network problem"; it’s a failure in the intricate dance between your Kubernetes cluster and the outside world. CoreDNS, the cluster’s internal DNS resolver, is supposed to act as a gateway, forwarding requests it can’t handle internally to external DNS servers. When that gateway jams, nothing outside the cluster is discoverable by name.
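
Before working through the causes, it helps to confirm that the failure really is limited to forwarded (external) names. The sketch below resolves one in-cluster name and one external name from a throwaway pod and reports which side fails. The function names, pod names, the `busybox:1.36` image, and the assumption that its `nslookup` exits non-zero on a failed lookup are all illustrative choices, not something your cluster necessarily matches.

```shell
# Resolve a name from a throwaway pod so the query goes through CoreDNS.
# $1 = pod name (hypothetical), $2 = hostname. Returns nslookup's exit status.
dns_probe() {
  kubectl run "$1" --rm -i --restart=Never --quiet \
    --image=busybox:1.36 -- nslookup "$2" >/dev/null 2>&1
}

# Compare an in-cluster name with an external one and name the likely fault.
classify_dns() {
  internal=fail
  external=fail
  dns_probe dns-probe-int kubernetes.default.svc.cluster.local && internal=ok
  dns_probe dns-probe-ext example.com && external=ok
  case "$internal/$external" in
    ok/ok)   echo "healthy" ;;
    ok/fail) echo "external-only: check forwarding to upstream DNS" ;;
    *)       echo "internal lookups fail too: check CoreDNS pods and cluster networking" ;;
  esac
}
```

Run `classify_dns` after sourcing the functions. An "external-only" result points at causes 1, 2, 5, and 6 below; if internal lookups fail as well, start with causes 3 and 4.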

Here’s how to diagnose and fix it, starting with the most common culprits:

1. NetworkPolicy Blocking Egress Traffic

Diagnosis: Check if any NetworkPolicy resources are restricting egress traffic from the kube-system namespace (where CoreDNS typically runs) to the internet or specific IP ranges.
```
kubectl get networkpolicy -n kube-system
```
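Look for policies that select the CoreDNS pods and list Egress in policyTypes, but either have no egress rules at all (which denies all egress) or only contain ipBlock entries that don't include your upstream DNS server IPs (e.g., 8.8.8.8/32, 1.1.1.1/32). An egress rule with an empty or omitted to field matches every destination, so that is not the restrictive case.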
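
To narrow the list, you can print only the policies whose policyTypes include Egress, since those are the only ones that can block outbound DNS. This is a minimal sketch assuming a POSIX shell with awk; the function name `egress_policies` is made up for illustration.

```shell
# Print the names of NetworkPolicies in a namespace that restrict egress.
# $1 = namespace (defaults to kube-system).
egress_policies() {
  ns="${1:-kube-system}"
  kubectl get networkpolicy -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.policyTypes}{"\n"}{end}' |
    awk -F'\t' '$2 ~ /Egress/ { print $1 }'
}
```

For each name it prints, inspect the policy with `kubectl describe networkpolicy <name> -n kube-system` and check whether its podSelector matches the CoreDNS pods.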

Fix: If a restrictive NetworkPolicy is found, either modify it to allow egress to your upstream DNS servers on UDP/TCP port 53, or create a new policy that explicitly permits this traffic.

```
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-coredns-egress
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      k8s-app: kube-dns # This label often targets CoreDNS pods
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 8.8.8.8/32 # Example: Google Public DNS
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Add other necessary egress rules for other upstream DNS servers or services
```

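Why it works: NetworkPolicy is Kubernetes' way of enforcing network segmentation. By default, pods can communicate freely, but once any policy with the Egress type selects a pod, only traffic that some policy explicitly allows may leave it. Policies are additive, so this one adds DNS queries (UDP/TCP 53) to external IP addresses to the allowed set for the CoreDNS pods; it does not override other policies. If no other policy selects these pods, this policy also isolates them, and CoreDNS still needs egress to the Kubernetes API server, so add a rule for that as well. All of this only applies if your CNI plugin enforces NetworkPolicy.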

2. Incorrect Upstream DNS Configuration in CoreDNS ConfigMap

Diagnosis: CoreDNS uses a ConfigMap (often named coredns) in the kube-system namespace to define its behavior. Examine this ConfigMap for the forward directive.

```
kubectl get configmap coredns -n kube-system -o yaml
```

Look for a section like this within the data.Corefile:

```
.:53 {
    errors
    health {
       lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    # This is the crucial part:
    forward . 8.8.8.8 1.1.1.1 {
       max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
```

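Ensure the IP addresses listed in the forward directive are correct and reachable from the nodes where the CoreDNS pods run. Many clusters ship with forward . /etc/resolv.conf instead of explicit IPs; in that case CoreDNS forwards to whatever resolvers appear in the resolv.conf its pod inherits from the node, so check that file on the node.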

Fix: Edit the coredns ConfigMap to correct the IP addresses in the forward directive.
```
kubectl edit configmap coredns -n kube-system
```
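Update the forward line with the correct upstream DNS server IPs (e.g., your internal DNS servers, or public ones like 8.8.8.8 and 1.1.1.1). If the Corefile loads the reload plugin, as the example above does, CoreDNS picks up the edited ConfigMap on its own after a short delay; if it doesn't, restart the pods with kubectl rollout restart deployment coredns -n kube-system.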
Why it works: The forward plugin tells CoreDNS where to send DNS queries it can’t resolve locally (like google.com). If these upstream IPs are wrong or unreachable, CoreDNS will fail to get answers for external hostnames. Correcting them ensures CoreDNS has valid destinations for its forwarded queries.
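
To see exactly which upstreams CoreDNS is using without reading the whole ConfigMap, you can pull the targets out of the forward line. The sketch below assumes the ConfigMap is named coredns with a Corefile key, as described above; `forward_targets` is a hypothetical helper name, and it only handles forward directives written on a single line.

```shell
# Read a Corefile on stdin and print each target of every forward directive,
# one per line. Field 1 is "forward", field 2 is the zone, the rest are targets.
forward_targets() {
  awk '$1 == "forward" { for (i = 3; i <= NF; i++) if ($i != "{") print $i }'
}

# Usage (needs cluster access):
#   kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}' | forward_targets
```

If the output is a path such as /etc/resolv.conf rather than IP addresses, the real upstreams are the nameserver entries in that file on the node.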

3. CoreDNS Pods Not Running or Crashing

Diagnosis: Check the status of the CoreDNS pods.
```
kubectl get pods -n kube-system -l k8s-app=kube-dns
```
If pods are in CrashLoopBackOff, Error, or Terminating states, investigate their logs.
```
kubectl logs <coredns-pod-name> -n kube-system
```
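Look for errors indicating configuration parsing issues, failing health checks, or resource exhaustion. For a pod that keeps restarting, add --previous to the kubectl logs command to see the output of the container that crashed rather than the one that just started.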
Fix:
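- Resource Limits: OOMKilled does not appear in the container logs; it shows up as the container's last state in kubectl describe pod <coredns-pod-name> -n kube-system. If you see it, raise the memory request/limit for the CoreDNS container in its Deployment. If lookups are slow or timing out rather than crashing, a CPU limit that is too low may be throttling CoreDNS, so raise that as well.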
```
kubectl edit deployment coredns -n kube-system
```
  Adjust resources.requests and resources.limits accordingly.
- Configuration Errors: If logs show parsing errors in the Corefile, fix the ConfigMap (as in point 2).
- Node Issues: If pods are stuck Pending, check node resources or taints/tolerations.
Why it works: CoreDNS must be running and healthy to perform its DNS resolution duties. Addressing resource constraints or configuration errors allows the CoreDNS pods to start and operate correctly.
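
The restart count and last termination reason for every CoreDNS pod can be listed in one pass, which makes an OOM kill easy to spot. This is a sketch under the assumptions that the pods carry the k8s-app=kube-dns label used above and that CoreDNS is the first container in the pod; both function names are hypothetical.

```shell
# Print "<pod>\t<restarts>\t<last termination reason>" for each CoreDNS pod.
coredns_last_exit() {
  kubectl get pods -n kube-system -l k8s-app=kube-dns \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'
}

# Print only the pods whose last termination reason was OOMKilled.
oom_killed_pods() {
  coredns_last_exit | awk -F'\t' '$3 == "OOMKilled" { print $1 }'
}
```

An empty reason column means the container has not terminated since the pod started; a reason of Error alongside a growing restart count usually points back at the Corefile.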

4. Node-Level DNS Configuration Issues (iptables/kube-proxy)

Diagnosis: Kubernetes uses kube-proxy and iptables (or IPVS) to route DNS traffic. Sometimes, node-level network configurations can interfere. Check iptables rules on the node where the CoreDNS pod is running.
```
# SSH into the node and run:
sudo iptables-save | grep -w -- '--dport 53'
```
Look for rules that might be dropping or misdirecting UDP/TCP traffic on port 53, especially to the node’s IP address or the cluster’s service IP for DNS. Also, check if kube-proxy is running correctly on the node.
```
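# kube-proxy usually runs as a DaemonSet; use systemctl only if it is a host service
kubectl get pods -n kube-system -l k8s-app=kube-proxy -o wide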
```
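Fix: This is often the trickiest. It might involve:
- Restarting kube-proxy on the affected node: delete its kube-proxy pod so the DaemonSet recreates it, or run sudo systemctl restart kube-proxy if kube-proxy is installed as a host service.
- Manually flushing iptables rules (use with extreme caution): sudo iptables -F clears every chain in the filter table, including rules owned by other software on the node. kube-proxy re-creates its own rules on its next sync, but nothing restores the others.
- Re-provisioning the node if the iptables state is severely corrupted.
- Ensuring kube-proxy is configured to use the correct network mode (e.g., iptables or ipvs).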
Why it works: kube-proxy manages the cluster’s network rules, including those for DNS. If these rules are broken, traffic intended for CoreDNS might be dropped or sent to the wrong place, even if CoreDNS itself is healthy. Restoring or correcting these rules re-establishes the correct network paths.
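
One concrete check is whether the node has rules that send traffic for the DNS Service's ClusterIP on port 53 anywhere at all. The sketch below counts them in iptables-save output. It assumes kube-proxy runs in iptables mode (in IPVS or nftables mode these rules won't exist) and that the DNS Service is named kube-dns; `count_dns_rules` is a hypothetical helper name.

```shell
# Count rules that match the given destination IP on port 53.
# $1 = ClusterIP of the DNS Service; stdin = iptables-save output.
count_dns_rules() {
  grep -F -- "-d $1/32 " | grep -c -- '--dport 53 '
}

# Usage on a node (needs root and cluster access):
#   ip="$(kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}')"
#   sudo iptables-save -t nat | count_dns_rules "$ip"
```

A healthy node normally shows one rule for UDP and one for TCP. A count of 0 suggests kube-proxy has not programmed the Service on that node.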

5. Firewall Rules on Nodes or Network Infrastructure

Diagnosis: Even if NetworkPolicy is permissive, host firewalls (firewalld, ufw, iptables directly on the node) or external network firewalls (e.g., cloud provider security groups, corporate firewalls) might be blocking egress from your Kubernetes nodes to the upstream DNS servers on UDP/TCP port 53.
```
# On a node, test TCP connectivity to an upstream DNS server on port 53.
# (telnet cannot test UDP, which carries most DNS queries; use dig for that.)
telnet 8.8.8.8 53
```
Check cloud provider security group rules, AWS NACLs, Azure NSGs, or your on-premises firewall configurations.
Fix: Update firewall rules on the nodes or network infrastructure to allow egress traffic from your Kubernetes nodes (specifically their node IPs) to your chosen upstream DNS servers on UDP and TCP port 53.
Why it works: This is a fundamental network connectivity issue. If the network path is blocked at the infrastructure level, CoreDNS’s requests simply won’t reach their destination, regardless of Kubernetes-internal configurations.
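
Because firewalls often treat UDP and TCP separately, it is worth probing both transports. The sketch below sends the same query over each and reports which one got a reply. It assumes dig is installed on the node; `probe_dns_port` is a hypothetical name, and example.com is only a placeholder query.

```shell
# Query a DNS server over UDP and over TCP and report which transports reply.
# $1 = server IP, $2 = name to query (defaults to example.com).
probe_dns_port() {
  server="$1"
  name="${2:-example.com}"
  udp=fail
  tcp=fail
  # dig exits non-zero when no reply arrives; +time/+tries keep the wait short.
  dig +notcp +time=2 +tries=1 "@$server" "$name" >/dev/null 2>&1 && udp=ok
  dig +tcp   +time=2 +tries=1 "@$server" "$name" >/dev/null 2>&1 && tcp=ok
  echo "udp=$udp tcp=$tcp"
}
```

For example, `probe_dns_port 8.8.8.8` printing `udp=fail tcp=ok` would suggest a rule that allows TCP 53 but drops UDP 53.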

6. DNS Server Unavailability or Misconfiguration

Diagnosis: The upstream DNS servers themselves might be down, overloaded, or misconfigured.
```
# From a node, try to resolve a hostname directly using the upstream server:
dig @8.8.8.8 google.com
```
If this fails consistently from a node whose egress you have already verified (see point 5), the problem is with the upstream server, not CoreDNS.
Fix: Switch to different, known-good upstream DNS servers in the CoreDNS ConfigMap or troubleshoot the upstream DNS infrastructure.
Why it works: If the target you’re forwarding requests to is broken, your forwarding service can’t succeed. This isolates the problem to the external DNS infrastructure.
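
When you have several candidate resolvers, checking them side by side shows whether one server is at fault or all of them are unreachable. This is a minimal sketch assuming dig is available; `check_upstreams` is a hypothetical name, and entries that are file paths (such as /etc/resolv.conf) are skipped rather than resolved.

```shell
# Read resolver IPs on stdin (one per line) and print "<ip> ok" or "<ip> fail".
check_upstreams() {
  while read -r server; do
    # Skip blank lines and file paths; only IPs or hostnames are queried.
    case "$server" in ''|/*) continue ;; esac
    if dig +short +time=2 +tries=1 "@$server" example.com </dev/null >/dev/null 2>&1; then
      echo "$server ok"
    else
      echo "$server fail"
    fi
  done
}
```

For example, `printf '8.8.8.8\n1.1.1.1\n' | check_upstreams` run from a node. If every resolver fails, suspect the network path before the servers themselves.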

Once external resolution works, the next issue you're likely to hit is pods being unable to reach services within the cluster because of kube-proxy or CNI misconfigurations.