CoreDNS replicas deployed for high availability on Kubernetes do not coordinate with each other or balance queries among themselves. Spreading queries across replicas and routing around unhealthy ones is done by Kubernetes, through the Service in front of the pods and the probes defined on them.

Let’s see this in action. Imagine a Kubernetes Service object targeting your CoreDNS pods. This Service is what your applications will use to resolve DNS queries.

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
spec:
  selector:
    k8s-app: kube-dns # This selects your CoreDNS pods
  ports:
    - name: dns
      port: 53
      protocol: UDP
    - name: dns-tcp
      port: 53
      protocol: TCP

When a pod in your cluster needs to resolve kubernetes.default.svc.cluster.local, it sends a DNS query to the kube-dns Service IP, which (under the default ClusterFirst DNS policy) the kubelet writes into the pod’s /etc/resolv.conf as its nameserver. The cluster’s Service data plane, kube-proxy in most clusters, then forwards the query to one of the ready CoreDNS pods backing that Service. CoreDNS does not decide which of its peers answers; the Service abstraction ensures that your query lands on an available and responsive CoreDNS instance.
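To make the first hop concrete, here is a minimal sketch that reads resolv.conf-style text and reports which nameserver a pod would send its queries to. The sample contents and the 10.96.0.10 address are illustrative assumptions, not values taken from any particular cluster; inside a real pod you would read /etc/resolv.conf instead.

```python
def parse_resolv_conf(text: str) -> dict:
    """Extract nameservers, search domains and options from resolv.conf text."""
    config = {"nameservers": [], "search": [], "options": []}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameservers"].append(values[0])
        elif keyword == "search":
            config["search"] = values  # a later search line replaces an earlier one
        elif keyword == "options":
            config["options"].extend(values)
    return config


# Hypothetical contents of a pod's /etc/resolv.conf; the IP is an example only.
SAMPLE = """\
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
"""

if __name__ == "__main__":
    cfg = parse_resolv_conf(SAMPLE)
    print("DNS queries go to:", cfg["nameservers"])
```

The nameserver entry is the Service IP, not the address of any individual CoreDNS pod, which is why replicas can come and go without clients noticing.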

For high availability, you’d typically deploy multiple CoreDNS replicas. Because CoreDNS is stateless, a Deployment is the usual way to manage these replicas.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
spec:
  replicas: 3 # At least 2 for HA, 3 is a good starting point
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      containers:
      - name: coredns
        image: coredns/coredns:1.10.1 # Use a recent, stable version
        args:
        - "-conf=/etc/coredns/Corefile"
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080 # Default health port if enabled in Corefile
          initialDelaySeconds: 60
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181 # Default port of the ready plugin
          initialDelaySeconds: 5
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
      volumes:
      - name: config-volume
        configMap:
          name: coredns
          items:
          - key: Corefile
            path: Corefile

The livenessProbe and readinessProbe are critical. The livenessProbe tells Kubernetes whether the CoreDNS container is still running and responsive. If it fails failureThreshold times in a row, the kubelet restarts the container. The readinessProbe tells Kubernetes whether the CoreDNS pod is ready to accept traffic. If it fails, the pod is removed from the kube-dns Service’s endpoints and receives no new queries until the probe passes again. This combination ensures that traffic is only directed to healthy, ready instances.
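The threshold fields are easier to reason about with a small model. The sketch below is not Kubernetes code; ProbeTracker is a hypothetical class that only mimics the documented rule that a probe's state flips after failureThreshold consecutive failures (or successThreshold consecutive successes).

```python
from dataclasses import dataclass


@dataclass
class ProbeTracker:
    """Toy model of one probe's pass/fail state (hypothetical, for illustration)."""
    failure_threshold: int = 3
    success_threshold: int = 1
    # Simplification: start in the passing state. Kubernetes treats a pod as
    # not ready until its readiness probe has succeeded for the first time.
    passing: bool = True
    _failures: int = 0
    _successes: int = 0

    def record(self, ok: bool) -> bool:
        """Feed one probe result and return the resulting state."""
        if ok:
            self._successes += 1
            self._failures = 0  # any success resets the failure streak
            if self._successes >= self.success_threshold:
                self.passing = True
        else:
            self._failures += 1
            self._successes = 0  # any failure resets the success streak
            if self._failures >= self.failure_threshold:
                self.passing = False
        return self.passing


if __name__ == "__main__":
    readiness = ProbeTracker(failure_threshold=3)
    results = [True, False, False, True, False, False, False]
    print([readiness.record(r) for r in results])
```

Isolated failures do not flip the state, only an unbroken run of them does. With periodSeconds: 10 and failureThreshold: 3 as in the manifest above, an unresponsive replica therefore keeps receiving queries for a few probe periods before it is taken out of rotation.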

The Corefile itself is where you configure CoreDNS’s behavior. For high availability, you’re primarily focused on ensuring your cluster DNS is handled correctly and that any upstream resolvers are configured robustly.

.:53 {
    errors
    health {
        lameduck 5s # Give it 5 seconds to shut down gracefully
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf {
       max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}

The health and ready plugins within the Corefile expose HTTP endpoints that the Kubernetes probes can check: /health on port 8080 and /ready on port 8181 by default. The loadbalance plugin does not distribute incoming requests from your application pods across CoreDNS replicas, and it does not choose upstream resolvers either. It acts on the answers CoreDNS returns: when a response contains several A, AAAA, or MX records, loadbalance randomizes their order so that clients, which usually take the first record, spread across the addresses. This is a common point of confusion.

When a CoreDNS pod receives a query for an external domain (e.g., google.com), the forward plugin sends it to one of the upstream servers listed in /etc/resolv.conf (or wherever you’ve defined them). The choice among those upstreams is governed by forward’s own policy option (random by default, with round_robin and sequential as alternatives), not by loadbalance. Neither plugin influences how Kubernetes directs traffic to your CoreDNS pods.
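The two mechanisms are easy to mix up, so here is a rough Python model of each. It is a sketch of the documented behavior, not CoreDNS source: shuffle_answer imitates what loadbalance does to the records in a response, and upstream_order imitates the three policies the forward plugin documents (random, round_robin, sequential). The function names, record tuples, and 192.0.2.x addresses are all made up for illustration.

```python
import random

SHUFFLED_TYPES = {"A", "AAAA", "MX"}


def shuffle_answer(records, rng=random):
    """Model of loadbalance: shuffle address-type records, leave the rest in front."""
    fixed = [r for r in records if r[0] not in SHUFFLED_TYPES]
    shuffled = [r for r in records if r[0] in SHUFFLED_TYPES]
    rng.shuffle(shuffled)
    return fixed + shuffled


def upstream_order(upstreams, policy="random", query_number=0, rng=random):
    """Model of forward's policy: the order in which upstreams would be tried."""
    if policy == "sequential":
        return list(upstreams)  # always start with the first listed upstream
    if policy == "round_robin":
        start = query_number % len(upstreams)  # rotate the start per query
        return list(upstreams[start:]) + list(upstreams[:start])
    if policy == "random":
        order = list(upstreams)
        rng.shuffle(order)
        return order
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    answer = [("CNAME", "web.example."), ("A", "192.0.2.1"), ("A", "192.0.2.2")]
    print(shuffle_answer(answer))  # the CNAME stays first, A records may swap
    resolvers = ["192.0.2.53", "192.0.2.54", "192.0.2.55"]
    for n in range(3):
        print(upstream_order(resolvers, "round_robin", query_number=n))
```

Both functions operate inside a single CoreDNS process. Neither has any say in which replica a client's query reaches in the first place.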

The prometheus plugin exposes metrics on port 9153, which you can scrape for monitoring. The cache plugin reduces load on upstream resolvers by caching DNS records, here for at most 30 seconds. The reload plugin lets CoreDNS pick up changes to its Corefile without restarting.

The most surprising aspect of CoreDNS HA on Kubernetes is how little CoreDNS itself contributes to the "high availability" of your cluster’s DNS. Kubernetes handles the distribution and health checking of the CoreDNS pods. CoreDNS’s loadbalance plugin only reorders records inside the answers it returns; it plays no part in inbound request distribution.

If you want to see how Kubernetes directs traffic, use kubectl exec to open a shell in a pod that has dig installed and run dig against the kube-dns Service IP. You’ll see it resolves consistently. If you then delete one CoreDNS pod, dig continues to work, and you can watch traffic shift to the remaining healthy pods in the Prometheus metrics, or in CoreDNS’s request logs if you add the log plugin, which the Corefile above does not enable.
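If the pod you exec into has Python rather than dig, a loop like the following can stand in for repeated dig calls. It is a minimal sketch: watch_resolution is a hypothetical helper, and it only tells you whether resolution kept working, not which CoreDNS replica answered.

```python
import socket
import time


def watch_resolution(name, attempts=5, interval=0.0, resolve=socket.gethostbyname):
    """Resolve `name` repeatedly and count successes and failures."""
    results = {"ok": 0, "failed": 0}
    for _ in range(attempts):
        try:
            resolve(name)
            results["ok"] += 1
        except OSError:  # socket.gaierror is a subclass of OSError
            results["failed"] += 1
        time.sleep(interval)
    return results


if __name__ == "__main__":
    # Run inside a pod, then delete a CoreDNS pod from another terminal.
    print(watch_resolution("kubernetes.default.svc.cluster.local",
                           attempts=30, interval=1.0))
```

With several ready replicas behind the Service, the failed count should stay at or near zero while a single CoreDNS pod is deleted.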

The next challenge you’ll likely encounter is optimizing CoreDNS performance and understanding its advanced configuration options, particularly around caching and upstream forwarding strategies.
