The Envoy Admin API is a powerful tool for debugging and understanding your Envoy proxy’s runtime behavior, but its most surprising utility comes not from inspecting static configurations, but from observing the dynamic evolution of those configurations and the real-time performance metrics that indicate how well they’re being applied.
Let’s say you have an Envoy instance running, and you want to see what it’s actually doing right now, not just what you think you configured. The admin API is exposed on a separate port, typically 9901, and you can access it via curl.
Here’s how you might check the current listeners Envoy is managing:
curl -s 'http://127.0.0.1:9901/listeners?format=json' | jq
With format=json, this prints a JSON object whose listener_statuses array has one entry per listener, giving its name and the local address it is bound to (the default output is plain text). That tells you which ports Envoy is truly listening on. The filter chains behind each listener, such as your HTTP connection manager, are not part of this output; they appear in /config_dump.
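When jq is not available, the default text output of /listeners can be parsed directly. The sketch below assumes that output has one name::address line per listener; the function name listeners_on_port is hypothetical, chosen for illustration.

```shell
# listeners_on_port PORT
# Reads the plain-text /listeners output on stdin (assumed to be one
# "name::address" line per listener) and prints the names of listeners
# whose address ends in the given port.
listeners_on_port() {
  awk -v port="$1" '
    {
      idx = index($0, "::")            # first "::" separates name from address
      if (idx == 0) next               # skip lines that do not match the layout
      name = substr($0, 1, idx - 1)
      addr = substr($0, idx + 2)
      n = split(addr, parts, ":")      # the port is the last ":"-separated piece
      if (parts[n] == port) print name
    }
  '
}

# Example (needs a running Envoy, so it is left commented out):
# curl -s http://127.0.0.1:9901/listeners | listeners_on_port 8080
```

Splitting on the first "::" rather than on every "::" keeps IPv6 addresses such as [::]:9000 intact.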
Now, let’s dive into the configuration itself. While you can see the applied configuration in the /listeners or /clusters endpoints, it’s often more useful to see the raw configuration that Envoy loaded.
curl -s http://127.0.0.1:9901/config_dump | jq '.configs[] | select(.["@type"] == "type.googleapis.com/envoy.admin.v3.ListenersConfigDump")'
This command dumps all loaded configuration and keeps only the listener section. The top-level entries of configs are typed by section (ListenersConfigDump, ClustersConfigDump, RoutesConfigDump, and so on), so swapping the type name gives you a snapshot of clusters or routes as Envoy sees them; endpoints are included only when you request /config_dump?include_eds. This is invaluable for verifying that your dynamic configuration sources (like xDS) are actually pushing the intended configuration to Envoy. You’ll see the exact filter chains, their configurations, any associated RDS references, and the version_info of each dynamically delivered resource.
Beyond configuration, the real-time statistics are where Envoy shines. The /stats endpoint is a firehose of operational data.
curl -s http://127.0.0.1:9901/stats | grep -F http.ingress_http.downstream_cx_total
This command shows the total number of downstream connections handled by the HTTP connection manager whose stat_prefix is ingress_http. Envoy organizes stats with a clear naming convention, making it easy to find metrics related to specific listeners, clusters, or filter chains. For example, cluster.<cluster_name>.upstream_cx_connect_fail tells you how many times Envoy failed to establish a connection to an upstream host in that cluster.
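Because counters only ever grow, a single reading says little; the change between two readings is usually what matters. A minimal sketch, assuming the text /stats output uses one "name: value" line per stat (the function name stats_delta and the file names are hypothetical):

```shell
# stats_delta OLD_FILE NEW_FILE
# Prints "name delta" for every integer-valued stat whose value changed
# between two saved copies of the /stats text output.
stats_delta() {
  awk -F': ' '
    # Skip histogram lines and anything else without a plain integer value.
    $2 !~ /^[0-9]+$/ { next }
    # First file: remember the old value of each stat.
    NR == FNR { old[$1] = $2; next }
    # Second file: report stats that existed before and have changed.
    ($1 in old) && $2 != old[$1] { print $1, $2 - old[$1] }
  ' "$1" "$2"
}

# Example (needs a running Envoy, so it is left commented out):
# curl -s http://127.0.0.1:9901/stats > before.txt
# sleep 10
# curl -s http://127.0.0.1:9901/stats > after.txt
# stats_delta before.txt after.txt
```

Gauges can move in either direction, so a negative delta is possible for them; for counters a negative delta would suggest a restart or a counter reset between the two snapshots.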
To understand the performance of a specific upstream cluster, you might look at:
curl -s 'http://127.0.0.1:9901/stats?filter=cluster.my_backend_cluster.'
The filter parameter is a regular expression, so this limits the output to stats related to my_backend_cluster (the URL is quoted so the shell does not interpret the ?). You’ll see latency histograms (upstream_cx_connect_ms, upstream_rq_time), request counts (upstream_rq_total), response code classes (upstream_rq_2xx, upstream_rq_5xx), and much more. This allows you to pinpoint bottlenecks or identify unhealthy upstream services directly from Envoy’s perspective.
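As one way to turn those counters into a single health signal, the sketch below computes the share of upstream requests that ended in a 5xx response. It assumes the counters are named cluster.<name>.upstream_rq_total and cluster.<name>.upstream_rq_5xx in the text /stats output; the function name cluster_5xx_percent is hypothetical.

```shell
# cluster_5xx_percent CLUSTER_NAME
# Reads /stats text on stdin and prints the percentage of upstream requests
# for the given cluster that ended in a 5xx response, or "n/a" if the
# cluster has not served any requests yet.
cluster_5xx_percent() {
  awk -F': ' -v prefix="cluster.$1." '
    $1 == prefix "upstream_rq_total" { total = $2 }
    $1 == prefix "upstream_rq_5xx"   { errors = $2 }
    END {
      # A missing 5xx counter is treated as zero errors.
      if (total > 0) printf "%.1f\n", 100 * errors / total
      else print "n/a"
    }
  '
}

# Example (needs a running Envoy, so it is left commented out):
# curl -s http://127.0.0.1:9901/stats | cluster_5xx_percent my_backend_cluster
```

The result is a lifetime ratio since Envoy started, so a recent burst of errors can be diluted by a long healthy history; comparing two snapshots gives a more current picture.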
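When dealing with dynamic routing, the RoutesConfigDump section of /config_dump is your friend.
curl -s http://127.0.0.1:9901/config_dump | jq '.configs[] | select(.["@type"] == "type.googleapis.com/envoy.admin.v3.RoutesConfigDump") | (.static_route_configs[]?, .dynamic_route_configs[]?) | .route_config | select(.name == "my_specific_route")'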
This shows you the active route configurations. You can inspect the match conditions, the action taken (e.g., route to a specific cluster, trigger a redirect), and any rate limiting or other policies applied. This is critical for debugging why a request might be going to the wrong upstream or not being handled as expected.
A common, yet often overlooked, aspect of Envoy’s internal state is its active health checking statistics. When Envoy performs active health checks against upstream hosts, it counts attempts, successes, and failures per cluster, for example cluster.my_backend_cluster.health_check.failure, and reports the number of currently healthy hosts in the gauge cluster.my_backend_cluster.health_check.healthy. A failure counter that keeps climbing is a strong indicator that an upstream is unhealthy, even if it’s still responding to some requests. Individual hosts that are currently failing are marked with the /failed_active_hc health flag in the /clusters output.
The GET /clusters endpoint is also crucial for understanding upstream health from Envoy’s perspective. It doesn’t just list clusters; it shows the state of each host within those clusters, including per-host connection and request counters and a health_flags field that reads healthy or lists the reasons the host is excluded, such as /failed_active_hc or /failed_outlier_check. This gives you a real-time view of Envoy’s understanding of your backend infrastructure’s availability.
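To pull only the problem hosts out of that output, the per-host health flags can be filtered. This is a sketch that assumes the default text format of /clusters, where each host has a line of the form cluster::host::health_flags::value; the function name unhealthy_hosts is hypothetical.

```shell
# unhealthy_hosts
# Reads the plain-text /clusters output on stdin and prints
# "cluster host flags" for every host whose health_flags is not "healthy".
unhealthy_hosts() {
  awk '
    match($0, /::health_flags::/) {
      flags = substr($0, RSTART + RLENGTH)   # text after "::health_flags::"
      if (flags == "healthy") next
      head = substr($0, 1, RSTART - 1)       # "cluster::host"
      sep = index(head, "::")                # first "::" ends the cluster name
      print substr(head, 1, sep - 1), substr(head, sep + 2), flags
    }
  '
}

# Example (needs a running Envoy, so it is left commented out):
# curl -s http://127.0.0.1:9901/clusters | unhealthy_hosts
```

Matching on the ::health_flags:: marker instead of splitting every line on "::" keeps IPv6 host addresses in one piece.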
Finally, after you’ve fixed your configuration issues and ensured your upstream services are healthy, the next problem you’ll likely encounter is understanding how Envoy is handling TLS termination and certificate management, which can be inspected via the /certs endpoint and the related statistics.