Envoy logs at info level by default, so its debug output stays hidden until you ask for it explicitly.

Let’s see what a typical request looks like when Envoy is emitting debug logs. Imagine a user hitting your service through Envoy, and Envoy is acting up. You’ve turned up the log level, and now you’re seeing lines like these in your Envoy logs (the sample is illustrative; exact messages, source files, and line numbers vary by Envoy version):

[2023-10-27 10:00:00.123] [debug][http] [source/common/http/conn_manager_impl.cc:1560] [C0x1234567890] request received: headers={:method=>"GET", :path=>"/users/123", :authority=>"example.com", :scheme=>"https", x-request-id=>"abcdef1234567890"}
[2023-10-27 10:00:00.124] [debug][http] [source/common/http/conn_manager_impl.cc:1038] [C0x1234567890] request is not filtered out
[2023-10-27 10:00:00.125] [debug][router] [source/common/router/router_impl.cc:380] [C0x1234567890] request to cluster 'users_service'
[2023-10-27 10:00:00.126] [debug][router] [source/common/router/router_impl.cc:380] [C0x1234567890] request to cluster 'users_service'
[2023-10-27 10:00:00.127] [debug][upstream] [source/common/upstream/cluster_manager_impl.cc:234] [C0x1234567890] selected new connection for cluster 'users_service'
[2023-10-27 10:00:00.128] [debug][http] [source/common/http/http1_parser.cc:134] [C0x1234567890] Sending request headers to upstream: {:method=>"GET", :path=>"/users/123", :authority=>"example.com", :scheme=>"https", x-request-id=>"abcdef1234567890"}
[2023-10-27 10:00:00.200] [debug][http] [source/common/http/conn_manager_impl.cc:1670] [C0x1234567890] response received from upstream: headers={:status=>"200", content-type=>"application/json", x-request-id=>"abcdef1234567890"}
[2023-10-27 10:00:00.201] [debug][router] [source/common/router/router_impl.cc:1120] [C0x1234567890] response from cluster 'users_service'
[2023-10-27 10:00:00.202] [debug][http] [source/common/http/conn_manager_impl.cc:1038] [C0x1234567890] response is not filtered out
[2023-10-27 10:00:00.203] [debug][http] [source/common/http/conn_manager_impl.cc:1670] [C0x1234567890] response headers={:status=>"200", content-type=>"application/json", x-request-id=>"abcdef1234567890"}

This isn’t just noise; it’s a step-by-step account of what Envoy is doing. You can see it receiving the request, deciding which cluster to send it to, selecting a connection, sending the request headers, and finally, receiving the response. This granular view is what helps you pinpoint where things are going wrong.
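
On a busy proxy, lines from many connections interleave, so the first step is usually to isolate one connection. The sketch below assumes the logs have been written to a file and that each line carries a bracketed connection tag, as in the sample above; the function name and the file path are hypothetical.

```shell
# Print only the log lines that belong to one connection.
# $1 = connection tag as it appears in the log, without brackets (e.g. C0x1234567890)
# $2 = log file to search (hypothetical path; use wherever your logs land)
lines_for_connection() {
  local conn="$1" file="$2"
  # -F treats the pattern as a fixed string, so the brackets match literally.
  grep -F "[${conn}]" "$file"
}

# Usage:
#   lines_for_connection C0x1234567890 /var/log/envoy/envoy.log
```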

Envoy is built to be a high-performance, extensible proxy. It handles routing, load balancing, service discovery, health checking, and more, all while providing observability. When it breaks, it’s usually because one of these functions isn’t behaving as expected, or because part of the configuration doesn’t match the actual state of the network.

To understand Envoy’s behavior, you need to know its configuration structure. It’s typically defined by a large JSON or YAML bootstrap file, broken down into static_resources (things fixed at startup, like listeners and clusters written directly in the file) and dynamic_resources (where Envoy is told to fetch listeners and clusters from a management server over xDS, which is also how routes, endpoints, and secrets can be updated without a restart). Within static_resources, you’ll find listeners, which define how Envoy accepts incoming connections, and clusters, which define the upstream services Envoy can connect to.

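Debug logging is controlled separately from those resources. Envoy’s application log level is set either at startup, with the --log-level (-l) and --component-log-level command-line flags, or on a running process through the /logging endpoint of the admin interface. The admin access log settings only record requests made to the admin interface itself, and runtime configuration plays no part in log levels. By default, Envoy logs at info level. To get debug level, you need to explicitly tell it to increase verbosity.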

Here’s how you enable it:

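First, you need to ensure your Envoy configuration has an admin section. If it doesn’t, add one. This section allows you to interact with the running Envoy process, including changing logging levels. The admin interface is unauthenticated, so outside of a local test consider binding it to 127.0.0.1 rather than 0.0.0.0. Note that access_log_path here sets where requests to the admin interface are logged (newer Envoy versions deprecate it in favor of access_log); it is not where debug logs go.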

admin:
  access_log_path: "/dev/stdout"
  address:
    socket_address:
      address: "0.0.0.0"
      port_value: 9901

You do not need any runtime configuration to change log levels. Runtime only matters for the runtime_modify overrides described further down: if your bootstrap defines layered_runtime, make sure it includes an admin layer, because that is where those overrides are stored.

layered_runtime:
  layers:
    - name: admin_layer
      admin_layer: {}

With the admin interface running, you can change the logging level without restarting Envoy. You’ll typically do this via curl to the admin port (9901 in the example above). The /logging endpoint takes its arguments as query parameters, not as a JSON body. To set the log level for all components to debug, you’d use:

curl -X POST "http://localhost:9901/logging?level=debug"

This command tells the running Envoy process to set every logger to debug. You can also set the level of a single component if you know exactly where to look, for example:

curl -X POST "http://localhost:9901/logging?http=debug"

This command targets only the http component. The name can be any of Envoy’s loggers (e.g., router, upstream, secret); a POST to /logging with no query parameters lists them with their current levels. The change lasts until the process restarts, so set the level back to info once you have what you need, because debug output is voluminous and adds overhead.
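
Envoy’s admin /logging endpoint takes the target and level as a single query parameter: level=<level> for every logger, or <logger>=<level> for one. A small helper can wrap that so the calls are harder to mistype. This is a sketch under that assumption; ENVOY_ADMIN, logging_url, and set_log_level are names invented for this example.

```shell
# Base URL of the admin listener; override it if yours is not on localhost:9901.
ENVOY_ADMIN="${ENVOY_ADMIN:-http://localhost:9901}"

# Build the /logging URL.
# $1 = logger name, or the literal word "level" to target every logger
# $2 = desired level (trace, debug, info, warning, error, critical, off)
logging_url() {
  local target="$1" level="$2"
  printf '%s/logging?%s=%s\n' "$ENVOY_ADMIN" "$target" "$level"
}

# POST the change to the running Envoy process.
set_log_level() {
  curl -s -X POST "$(logging_url "$1" "$2")"
}

# Usage:
#   set_log_level level debug    # all loggers
#   set_log_level router debug   # router only
#   set_log_level level info     # back to the default
```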

Once enabled, the debug logs appear on Envoy’s standard error stream, or in the file given by the --log-path flag if one was set at startup. They do not go to the admin access_log_path, which only records requests made to the admin interface. If you’re using Kubernetes, stderr is captured by your container runtime and log aggregator.
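
If a restart is acceptable, the same levels can be set at startup with Envoy’s --log-level and --component-log-level flags, which is useful when the problem occurs before you could reach the admin port. A minimal sketch follows; the function name and the ENVOY_BIN override are hypothetical, and the config path is whatever your deployment uses.

```shell
# Start Envoy at info level overall, with selected loggers raised.
# $1 = path to the bootstrap config
# $2 = comma-separated logger:level pairs, e.g. "http:debug,router:debug"
# ENVOY_BIN is a hypothetical override for the binary path; it defaults to "envoy".
start_envoy_with_debug() {
  local config="$1" components="$2"
  "${ENVOY_BIN:-envoy}" -c "$config" --log-level info --component-log-level "$components"
}

# Usage:
#   start_envoy_with_debug /etc/envoy/envoy.yaml "http:debug,router:debug"
```

Keeping the global level at info and raising only the loggers you care about keeps the output readable.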

The most surprising thing about Envoy’s debug logging is that it often logs not just what Envoy does, but why it made a particular decision. How much detail you get depends on the component and the Envoy version. For example, when selecting an upstream host, debug output might show details such as weight, priority, or the health status of the available endpoints.

When you’re debugging, you’ll often find yourself looking at logs related to http, router, and upstream. The http logs show the request and response lifecycle from Envoy’s perspective as a connection manager. The router logs detail how Envoy decides which upstream cluster to send the request to, including header manipulation and route matching. The upstream logs cover the specifics of establishing connections to your backend services and the health of those connections.
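
Since these three loggers are usually needed together and debug output is expensive, it can help to raise them for a fixed window and then put them back. The sketch below assumes the admin endpoint accepts <logger>=<level> query parameters and that info is the level you want to return to; debug_window and the CURL override are hypothetical names used only here.

```shell
# Raise http, router, and upstream to debug, wait, then restore info.
# $1 = admin base URL, e.g. http://localhost:9901
# $2 = how many seconds to keep debug logging on
# CURL is a hypothetical override hook so the function can be dry-run.
debug_window() {
  local admin="$1" seconds="$2" logger
  for logger in http router upstream; do
    "${CURL:-curl}" -s -X POST "${admin}/logging?${logger}=debug"
  done
  sleep "$seconds"
  for logger in http router upstream; do
    "${CURL:-curl}" -s -X POST "${admin}/logging?${logger}=info"
  done
}

# Usage:
#   debug_window http://localhost:9901 30
```

If Envoy was started with non-default levels, restore those instead of info.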

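The one thing most people don’t know is that the admin interface can also override runtime settings, not just increase logging. If you suspect that a behavior guarded by a runtime flag (keys beginning with envoy.reloadable_features.) is causing issues, you can temporarily turn it off to see if the problem resolves. Not everything is runtime-guarded: TLS inspection, for instance, is a listener filter and has to be changed in the listener configuration. Like /logging, runtime_modify takes query parameters rather than a JSON body. With <feature_name> standing in for a real key, you’d do this with a command like:

curl -X POST "http://localhost:9901/runtime_modify?envoy.reloadable_features.<feature_name>=false"

Sending the same key with an empty value removes the override again.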

This lets you surgically disable parts of Envoy’s functionality to isolate the root cause of a problem, which is incredibly powerful for complex issues.
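
Envoy’s /runtime_modify endpoint takes key=value pairs as query parameters, and an empty value removes an override previously added through the endpoint. A small URL builder makes the set-and-revert pair explicit. This is a sketch: runtime_modify_url is a hypothetical helper, and FEATURE_KEY is a placeholder rather than a real feature name.

```shell
# Build a /runtime_modify URL.
# $1 = admin base URL, $2 = runtime key, $3 = value ("" removes the override)
runtime_modify_url() {
  local admin="$1" key="$2" value="$3"
  printf '%s/runtime_modify?%s=%s\n' "$admin" "$key" "$value"
}

# FEATURE_KEY is a placeholder; take the real key from the release notes for
# your Envoy version, or from the admin /runtime endpoint.
# Usage:
#   curl -s -X POST "$(runtime_modify_url http://localhost:9901 "$FEATURE_KEY" false)"  # override
#   curl -s -X POST "$(runtime_modify_url http://localhost:9901 "$FEATURE_KEY" "")"     # revert
```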

The next step after enabling debug logging and observing the output is usually to correlate these logs with metrics from your backend services.
