Envoy’s no healthy upstream error, returned to the client as an HTTP 503, means that Envoy, acting as a proxy, selected a cluster for the request but couldn’t find any healthy instances of the service you’re trying to reach.

Here’s a breakdown of common causes and how to fix them:

1. Service Not Registered with Discovery Service (e.g., Kubernetes, Consul, Eureka)

Diagnosis: Check your service discovery system to ensure the target service and its healthy instances are actually registered.

  • Kubernetes: kubectl get endpoints <your-service-name> -n <your-namespace>
    • Look for an ENDPOINTS column with IP addresses. If it’s empty or <none>, no ready pods match the Service’s selector.
  • Consul: consul catalog services or check the Consul UI.
  • Eureka: Check the Eureka dashboard for registered services and their health status.

Fix:

  • Kubernetes: Ensure your service definition (Service object) correctly selects the pods that are running your application. Verify that the pods themselves are healthy (e.g., kubectl get pods -n <your-namespace> -l app=<your-app-label>). If pods are unhealthy, fix the pod issues (e.g., restart them, fix application errors). If the Service selector is wrong, update it.
  • Consul/Eureka: Ensure your service registration mechanism is correctly configured to report healthy instances. This might involve fixing the health check configuration for your service or ensuring the registration agent is running and healthy.

Why it works: When endpoints come from service discovery, Envoy only knows about the instances the discovery system reports. If the service discovery system doesn’t know about your service or marks all its instances as unhealthy, Envoy has no backend to route to, hence the error.
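As a rough illustration of the Kubernetes check above, the following Python sketch counts ready and not-ready addresses in the JSON form of an Endpoints object (what kubectl get endpoints <your-service-name> -o json prints). The function name summarize_endpoints and the sample data are hypothetical; only the subsets, addresses, notReadyAddresses, and ports fields come from the Kubernetes Endpoints schema.

```python
import json

def summarize_endpoints(endpoints):
    """Split the addresses of a Kubernetes Endpoints object into ready / not ready."""
    ready, not_ready = [], []
    for subset in endpoints.get("subsets") or []:
        ports = [p["port"] for p in subset.get("ports") or []]
        for addr in subset.get("addresses") or []:
            ready.extend(f"{addr['ip']}:{port}" for port in ports)
        for addr in subset.get("notReadyAddresses") or []:
            not_ready.extend(f"{addr['ip']}:{port}" for port in ports)
    return {"ready": ready, "not_ready": not_ready}

# Hypothetical sample standing in for:
#   kubectl get endpoints <your-service-name> -n <your-namespace> -o json
SAMPLE_ENDPOINTS = json.loads("""
{
  "kind": "Endpoints",
  "metadata": {"name": "my-service"},
  "subsets": [
    {
      "addresses": [{"ip": "10.0.1.5"}],
      "notReadyAddresses": [{"ip": "10.0.1.6"}],
      "ports": [{"port": 8080}]
    }
  ]
}
""")

if __name__ == "__main__":
    summary = summarize_endpoints(SAMPLE_ENDPOINTS)
    print("ready:", summary["ready"])
    print("not ready:", summary["not_ready"])
    if not summary["ready"]:
        print("No ready endpoints: check the Service selector and pod readiness.")
```

An empty ready list corresponds to the <none> case described above. A non-empty not_ready list usually points at failing readiness probes rather than a wrong selector.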

2. Route Points at the Wrong Cluster Name

Diagnosis: Examine your Envoy configuration (the clusters and route_config sections, or their dynamic equivalents) and compare the name field of the relevant cluster definition against the cluster field of the route that handles the request.

  • Look for a line like: name: "my-backend-service" in your cluster definitions.
  • Then, look at the route that matches the request. Envoy does not look up clusters by the request’s Host header. The Host header selects a VirtualHost, the VirtualHost’s routes are matched, and the matched route names the cluster. If the route says cluster: api.example.com but the cluster that holds your endpoints is named my-backend-service, the request goes to the wrong cluster or to none at all.

Fix: Ensure the cluster field of the route exactly matches the name field of the cluster that holds your backend endpoints. The name is an arbitrary identifier. Naming the cluster after the DNS name of your backend service is a common convention, not a requirement.

  • Example Fix: If Envoy receives a request with Host: api.internal.net and the matching route specifies cluster: api.internal.net, your cluster definition should look like:
    clusters:
    - name: api.internal.net
      connect_timeout: 0.25s
      type: STRICT_DNS
      dns_refresh_rate: 1s
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: api.internal.net
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: "api.internal.net" # Envoy will resolve this DNS name
                  port_value: 8080
    

Why it works: The cluster name is the key a route uses to look up backend configuration. If the route names a cluster that doesn’t exist, or one that has no healthy endpoints, Envoy has nowhere to send the request.
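One way to catch a mismatched cluster name before deploying is to compare the names that routes reference with the names that are actually defined. The sketch below is a minimal illustration, assuming the Envoy bootstrap has already been parsed into a Python dict (for example from JSON) and that clusters live under static_resources.clusters. It only handles the plain cluster field of a route action, not weighted_clusters or cluster_header, and the helper names are hypothetical.

```python
def route_cluster_names(config):
    """Collect every `route: {cluster: <name>}` reference anywhere in the config."""
    names = set()

    def walk(node):
        if isinstance(node, dict):
            route = node.get("route")
            if isinstance(route, dict) and isinstance(route.get("cluster"), str):
                names.add(route["cluster"])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
    return names

def defined_cluster_names(config):
    """Names of the statically defined clusters."""
    clusters = config.get("static_resources", {}).get("clusters", [])
    return {c["name"] for c in clusters}

def missing_clusters(config):
    """Cluster names that a route references but no cluster defines."""
    return sorted(route_cluster_names(config) - defined_cluster_names(config))

# Hypothetical, heavily trimmed config: the route and the cluster disagree on the name.
SAMPLE_CONFIG = {
    "static_resources": {
        "listeners": [{
            "filter_chains": [{
                "filters": [{
                    "typed_config": {
                        "route_config": {
                            "virtual_hosts": [{
                                "name": "api",
                                "domains": ["api.example.com"],
                                "routes": [{
                                    "match": {"prefix": "/"},
                                    "route": {"cluster": "api.example.com"},
                                }],
                            }],
                        },
                    },
                }],
            }],
        }],
        "clusters": [{"name": "my-backend-service"}],
    },
}

if __name__ == "__main__":
    print("routes reference undefined clusters:", missing_clusters(SAMPLE_CONFIG))
```

A non-empty result means at least one route points at a name that no cluster carries. Clusters delivered dynamically (CDS) are outside the scope of this sketch.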

3. Envoy type: STRICT_DNS and DNS Resolution Issues

Diagnosis: If your cluster type is STRICT_DNS, Envoy will perform DNS lookups for the hostname specified in load_assignment.endpoints[0].lb_endpoints[0].endpoint.address.socket_address.address.

  • From the Envoy proxy itself, try resolving the hostname:
    • kubectl exec -it <envoy-pod-name> -n <envoy-namespace> -- nslookup <backend-hostname>
    • kubectl exec -it <envoy-pod-name> -n <envoy-namespace> -- dig <backend-hostname>
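  • If these commands fail or return no A or AAAA records, Envoy won’t be able to find an IP address. Minimal Envoy images may not include nslookup or dig, in which case run the lookup from a debug container that shares the pod’s network.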

Fix:

  • Ensure the DNS server Envoy is configured to use is functional and can resolve the backend service’s hostname.
  • Verify that the address field in your load_assignment correctly specifies the DNS name that should resolve to your backend service.
  • If using Kubernetes Service objects, STRICT_DNS typically works by resolving the Kubernetes service DNS name (e.g., <service-name>.<namespace>.svc.cluster.local). Ensure this name is correct and that the Kubernetes DNS (like CoreDNS) is healthy.

Why it works: STRICT_DNS requires Envoy to actively resolve the provided hostname to an IP address. If that resolution fails, Envoy has no endpoints to connect to.
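If neither nslookup nor dig is available but a Python interpreter is, a lookup through the system resolver gives a similar signal. This is only a sketch: the function name resolve_host is hypothetical, and Envoy uses its own DNS resolver, so the result here can differ from what Envoy sees. It is still useful for spotting a name that does not resolve at all from that network location.

```python
import socket

def resolve_host(hostname, port=0):
    """Return the sorted, de-duplicated IP addresses the system resolver yields.

    An empty list means the name did not resolve from where this script runs.
    """
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    # Replace with the address from your load_assignment, for example
    # <service-name>.<namespace>.svc.cluster.local
    for name in ("localhost",):
        addresses = resolve_host(name)
        print(name, "->", addresses or "no addresses (resolution failed)")
```

Run it from the same pod or network namespace as Envoy. Resolving the name from a laptop says little about what the proxy can resolve.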

4. Incorrect Endpoints in load_assignment (STATIC) or Broken Redirection (ORIGINAL_DST)

Diagnosis: If your cluster type is STATIC or ORIGINAL_DST, Envoy doesn’t use DNS discovery. It relies on hardcoded endpoints or dynamically inferred ones.

  • STATIC: Check the load_assignment.endpoints section for the correct IP addresses and ports of your backend services.
    • Example:
      clusters:
      - name: my-static-backend
        type: STATIC
        load_assignment:
          cluster_name: my-static-backend
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: "10.0.1.5" # Hardcoded IP
                    port_value: 8080
      
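    • Verify the IPs and ports are reachable from the Envoy pod. ping only proves that the host answers ICMP, so prefer a TCP check against the actual port: kubectl exec -it <envoy-pod-name> -n <envoy-namespace> -- telnet 10.0.1.5 8080 (or nc -zv 10.0.1.5 8080, whichever tool is available in the image).
  • ORIGINAL_DST: This type has no load_assignment. It forwards each connection to the address the client originally dialed, which the original_dst listener filter recovers after the traffic has been redirected to Envoy (for example by iptables REDIRECT or TPROXY). Ensure the redirection and the listener filter are set up correctly. This is a less common cause of the no healthy upstream error but can happen if the network setup is broken.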

Fix:

  • STATIC: Update the address and port_value in your load_assignment to point to the correct, reachable IP addresses and ports of your backend services.
  • ORIGINAL_DST: Troubleshoot your network configuration, iptables rules, and the listener’s original_dst listener filter. This often involves ensuring traffic is correctly redirected to Envoy (REDIRECT or TPROXY) so that the original destination can still be recovered.

Why it works: For STATIC clusters, Envoy only knows about the IPs you’ve explicitly listed. If those IPs are wrong or unreachable, it has no backends. ORIGINAL_DST relies on the OS’s network stack to have done its job correctly.
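The reachability check for STATIC endpoints can also be scripted. The sketch below attempts a plain TCP connection, which is closer to what Envoy does than an ICMP ping. The function name can_connect is hypothetical, and the demo connects to a throwaway local listener so that it runs anywhere; in practice you would pass the address and port_value from your load_assignment.

```python
import socket

def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False

if __name__ == "__main__":
    # Demo against a temporary local listener instead of a real backend.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print("listener up:", can_connect("127.0.0.1", port))
    listener.close()
    print("listener closed:", can_connect("127.0.0.1", port))
```

A False result from inside the Envoy pod for an address listed in load_assignment suggests a wrong IP or port, a backend that is down, or a firewall or network policy in between.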

5. Health Check Failures

Diagnosis: Envoy performs active health checks on an upstream cluster only when the cluster has a health_checks section. They are not enabled by default. When they are configured and all instances in a cluster fail their health checks, Envoy removes them from the active pool, leading to this error.

  • Check Envoy’s logs for health check failure messages.
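  • Look at the Envoy Admin API for cluster and endpoint health status (this assumes the admin interface listens on 127.0.0.1:9901):
    • curl 'http://127.0.0.1:9901/clusters?format=json'
    • curl -s http://127.0.0.1:9901/clusters | grep my-backend-cluster
    • Look at each host’s health_flags (text output) or health_status (JSON output). A flag such as /failed_active_hc means the host is failing active health checks. If every host in the cluster is flagged, or the cluster lists no hosts at all, there are no healthy endpoints.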

Fix: Investigate why the health checks are failing.

  • Application Issues: The backend service might be crashing, overloaded, or not responding correctly to its health endpoint.
  • Network Issues: Firewalls or network policies might be blocking Envoy’s health check probes to the backend service.
  • Health Check Configuration: The health check configuration in Envoy might be too aggressive (e.g., too short a timeout, too few retries) or pointing to the wrong health endpoint. Adjust interval, timeout, unhealthy_threshold, healthy_threshold in your cluster’s health_checks section.
    clusters:
    - name: my-backend-service
      # ... other config ...
      health_checks:
      - timeout: 0.5s
        interval: 5s
        unhealthy_threshold: 3
        healthy_threshold: 2
        http_health_check:
          path: "/healthz"
          # ... other http options ...
    

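Why it works: Envoy is designed to be resilient. It stops sending traffic to endpoints that are not responding to health checks to avoid sending requests to dead services. If no endpoint is left to choose from, you get this error. Depending on the cluster’s healthy_panic_threshold setting, Envoy may instead enter panic mode and spread traffic across all hosts regardless of health, so an empty cluster is as likely a cause as a fully failing one.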
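Reading the JSON form of the admin /clusters output by eye gets tedious with many clusters. The following sketch tallies healthy hosts per cluster. It assumes the cluster_statuses / host_statuses / health_status layout of the admin response and treats a host as healthy when no failure flag is set and its eds_health_status is HEALTHY or UNKNOWN (or absent). That is a simplification of Envoy’s own logic, and the sample payload is hypothetical.

```python
# Flags in a host's health_status that mark it as unhealthy for this sketch.
FAILURE_FLAGS = ("failed_active_health_check", "failed_outlier_check")

def is_healthy(host_status):
    """Simplified health test for one entry of host_statuses."""
    health = host_status.get("health_status", {})
    if any(health.get(flag) for flag in FAILURE_FLAGS):
        return False
    return health.get("eds_health_status", "HEALTHY") in ("HEALTHY", "UNKNOWN")

def healthy_host_counts(clusters_json):
    """Map cluster name -> (healthy hosts, total hosts)."""
    counts = {}
    for cluster in clusters_json.get("cluster_statuses", []):
        hosts = cluster.get("host_statuses", [])
        healthy = sum(1 for host in hosts if is_healthy(host))
        counts[cluster["name"]] = (healthy, len(hosts))
    return counts

# Hypothetical payload standing in for:
#   curl 'http://127.0.0.1:9901/clusters?format=json'
SAMPLE_CLUSTERS = {
    "cluster_statuses": [
        {
            "name": "my-backend-cluster",
            "host_statuses": [
                {
                    "address": {"socket_address": {"address": "10.0.1.5", "port_value": 8080}},
                    "health_status": {"failed_active_health_check": True,
                                      "eds_health_status": "HEALTHY"},
                },
                {
                    "address": {"socket_address": {"address": "10.0.1.6", "port_value": 8080}},
                    "health_status": {"eds_health_status": "HEALTHY"},
                },
            ],
        },
        {"name": "empty-cluster"},
    ]
}

if __name__ == "__main__":
    for name, (healthy, total) in healthy_host_counts(SAMPLE_CLUSTERS).items():
        note = "  <-- no healthy upstream" if healthy == 0 else ""
        print(f"{name}: {healthy}/{total} healthy{note}")
```

A cluster reported as 0/0 has no endpoints at all, which points back to discovery or DNS (causes 1 to 4). A cluster reported as 0/N has endpoints that are all failing their checks.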

6. Incorrect Host Header Rewriting or Mismatch in VirtualHost

Diagnosis: Envoy uses the incoming request’s Host header (:authority in HTTP/2) to pick a VirtualHost, whose routes then select a Cluster. If the Host header is modified incorrectly before it reaches Envoy or if your VirtualHost’s domains don’t match what’s being sent, the request can land on a different VirtualHost (for example a catch-all with domains: ["*"]) whose cluster has no healthy endpoints. If no VirtualHost or route matches at all, Envoy returns a 404 rather than this error.

  • Check the domains in your http_connection_manager.route_config.virtual_hosts.
    • Example: domains: ["api.example.com"]
  • Inspect the actual Host header arriving at Envoy. Use tcpdump on the Envoy pod (plaintext traffic only), log %REQ(:AUTHORITY)% in Envoy’s access log, or check upstream logs if Envoy is forwarding the header.

Fix:

  • Ensure the domains list in your VirtualHost configuration includes the exact hostname that clients are sending in their Host header.
  • If you have a load balancer in front of Envoy, ensure it’s not stripping or altering the Host header in a way that breaks the match.
  • If you’re using host_rewrite_literal (host_rewrite in the older v2 API) in your routes, remember that it only changes the Host header sent to the upstream after the route has already been chosen. It does not affect VirtualHost matching, so set it to the hostname the backend expects.

Why it works: The VirtualHost acts as a primary dispatcher. If the incoming request’s Host doesn’t match the domains of the VirtualHost you intended, the request is handled by a different VirtualHost, or by none, and never reaches the cluster you expected.
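To see why a small difference in the Host header changes which VirtualHost handles a request, here is a simplified model of domain matching. It assumes the documented precedence (exact domain, then suffix wildcard such as *.example.com, then prefix wildcard such as api.*, then "*"), prefers the longest pattern within a tier, and compares case-insensitively. It is an approximation for reasoning about configuration, not Envoy’s implementation, and the function name match_virtual_host is hypothetical.

```python
def match_virtual_host(host, virtual_hosts):
    """Return the name of the virtual host that would handle `host`, or None."""
    host = host.lower()
    best = None  # ((tier, -pattern_length), name); smaller tuple wins
    for vh in virtual_hosts:
        for domain in vh["domains"]:
            d = domain.lower()
            if d == host:
                tier = 0                      # exact match
            elif d == "*":
                tier = 3                      # catch-all
            elif d.startswith("*") and host.endswith(d[1:]):
                tier = 1                      # suffix wildcard, e.g. *.example.com
            elif d.endswith("*") and host.startswith(d[:-1]):
                tier = 2                      # prefix wildcard, e.g. api.*
            else:
                continue
            key = (tier, -len(d))
            if best is None or key < best[0]:
                best = (key, vh["name"])
    return best[1] if best else None

# Hypothetical route configuration.
VIRTUAL_HOSTS = [
    {"name": "api", "domains": ["api.example.com"]},
    {"name": "wildcard", "domains": ["*.example.com"]},
    {"name": "fallback", "domains": ["*"]},
]

if __name__ == "__main__":
    for header in ("api.example.com", "api.example.com:8080", "other.test"):
        print(header, "->", match_virtual_host(header, VIRTUAL_HOSTS))
```

Note how api.example.com:8080 falls through to the catch-all in this model: the port is part of the Host value, so neither the exact domain nor the suffix wildcard matches. Envoy’s HTTP connection manager has options for stripping the port before matching (strip_matching_host_port and strip_any_host_port); listing the host:port form in domains is another way to handle it.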

The next error you’ll likely encounter, after fixing this, is a 503 with the message upstream connect error or disconnect/reset before headers, which appears when Envoy picks an endpoint it considers healthy but cannot establish or keep a connection to it.
