Envoy doesn’t just sit there and watch; it actively rewrites your application’s network requests to enforce policies and gather telemetry.

Let’s see how this plays out. Imagine you have a simple curl command hitting a local service running on localhost:8080.

curl http://localhost:8080/hello

Now, let’s put Envoy in the middle. Your application still listens on localhost:8080, but curl now talks to Envoy’s traffic listener, here on localhost:8001. Envoy’s admin interface, conventionally on localhost:9901, is a separate port used for stats and debugging, not for application traffic.

# Assuming Envoy is configured to listen on 8001 for incoming traffic
curl http://localhost:8001/hello

Envoy intercepts this request. It doesn’t just forward it; it can:

  1. Apply Policies: Check if the request is allowed based on its configuration (e.g., rate limiting, access control lists). If it’s not allowed, Envoy returns an error before it even reaches your application.
  2. Transform Requests: Add or modify headers. For example, it might add x-request-id for tracing, or inject authentication tokens.
  3. Route Dynamically: Decide which instance of your service to send the request to based on sophisticated routing rules (e.g., A/B testing, canary deployments).
  4. Gather Telemetry: Record metrics like request duration, response code, bytes sent/received, and send them to a monitoring system.
  5. Handle Resilience: Implement circuit breakers, retries, and timeouts, abstracting these concerns away from your application.

Finally, Envoy forwards the (potentially transformed) request to your actual application, which is still listening on localhost:8080. The response from your application travels back through Envoy, which can again inspect, transform, or record metrics before returning it to the original client (curl).

This whole process means your application code only needs to worry about its core business logic. It doesn’t need to implement retries for flaky downstream services, generate unique request IDs for tracing, or enforce complex routing logic. Envoy handles all of those external, cross-cutting network concerns.
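As a rough illustration of how such concerns move out of application code and into proxy configuration, the fragment below sketches a single route entry with a timeout, a retry policy, and an added request header. The specific values and the header name x-served-via are illustrative assumptions, not recommendations; cluster_0 is the cluster name used in the full configuration later in this article. Envoy’s HTTP connection manager generates x-request-id by default, so no configuration is shown for it.

```yaml
# Fragment of one entry under virtual_hosts[].routes (not a complete config).
- match:
    prefix: "/"
  route:
    cluster: cluster_0
    timeout: 2s                    # overall request timeout (assumed value)
    retry_policy:
      retry_on: "5xx,connect-failure"
      num_retries: 2               # assumed value
      per_try_timeout: 0.5s        # assumed value
  request_headers_to_add:
  - header:
      key: x-served-via            # hypothetical header name
      value: envoy-sidecar         # hypothetical value
```

With a route like this, the application never sees the retry logic: a failed attempt is retried by the proxy, and the caller only observes the final outcome.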

The core problem Envoy as a sidecar solves is the separation of concerns for network-related functionalities. Instead of embedding complex networking logic into every microservice (which leads to duplicated effort, inconsistent implementations, and difficult maintenance), you delegate these responsibilities to a dedicated, highly optimized proxy that runs alongside each service instance. This allows your application code to be leaner, more focused, and easier to reason about.

Consider the configuration for Envoy. A minimal configuration might look something like this (in YAML, usually supplied as a bootstrap file at startup):

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8001
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: cluster_0
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: cluster_0
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    lb_policy: ROUND_ROBIN
    # This is the crucial part: where Envoy sends traffic *after* intercepting it.
    # It's the address of your actual application.
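    # (The v3 API uses load_assignment; the older hosts field is no longer accepted.)
    load_assignment:
      cluster_name: cluster_0
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8080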

In this configuration, Envoy is told to listen on port 8001. Any HTTP request arriving on 8001 is processed by the http_connection_manager, whose route_config matches every path (prefix "/") and selects cluster_0. The router filter then forwards the request to that cluster. cluster_0 uses LOGICAL_DNS discovery, which normally resolves a hostname; because the address here is already an IP (127.0.0.1), there is nothing to look up and traffic goes straight to 127.0.0.1:8080. So, a request to localhost:8001/some/path is proxied by Envoy, with the path unchanged, to your application listening on localhost:8080/some/path.
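The configuration above does not define the admin interface mentioned earlier. A minimal sketch of the top-level admin block follows, assuming the conventional port 9901 and binding to loopback only so that it is not reachable from other hosts:

```yaml
# Top-level block, a sibling of static_resources in the bootstrap file.
admin:
  address:
    socket_address:
      address: 127.0.0.1   # loopback only (assumption for a local setup)
      port_value: 9901
```

With this in place, the metrics Envoy records for the ingress_http listener can be inspected at the admin endpoint’s /stats path.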

What surprises most people is that the "DNS" in LOGICAL_DNS describes only how Envoy finds addresses, not everything it does with them. When Envoy runs as a sidecar in an environment like Kubernetes, a cluster that represents another service typically uses that service’s DNS name as its endpoint address (e.g., my-app-service.my-namespace.svc.cluster.local). Envoy resolves the name through Kubernetes’s DNS, which returns the service’s cluster IP, or the individual pod IPs if the service is headless. With LOGICAL_DNS, Envoy uses only the first address returned each time it opens a new connection; with STRICT_DNS, every returned address becomes a host that Envoy load balances across. Settings such as connect_timeout and lb_policy (like ROUND_ROBIN or LEAST_REQUEST) are Envoy’s own mechanisms for managing connections and distributing load before a request ever reaches an application pod. It’s not just a simple DNS lookup; Envoy maintains a pool of connections to the upstream endpoints and can actively track their health and load.
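To make this concrete, here is a hedged sketch of a cluster that points at a Kubernetes service name. The cluster name my_app_upstream, the choice of STRICT_DNS with LEAST_REQUEST, and all thresholds are assumptions for illustration; the service name is the example one from the paragraph above.

```yaml
# Fragment for static_resources.clusters (not a complete config).
clusters:
- name: my_app_upstream              # hypothetical cluster name
  connect_timeout: 0.25s
  type: STRICT_DNS                   # treat every resolved address as a host
  lb_policy: LEAST_REQUEST
  load_assignment:
    cluster_name: my_app_upstream
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: my-app-service.my-namespace.svc.cluster.local
              port_value: 8080
  circuit_breakers:
    thresholds:
    - priority: DEFAULT
      max_connections: 100           # assumed value
      max_pending_requests: 50       # assumed value
  outlier_detection:
    consecutive_5xx: 5               # eject a host after 5 consecutive 5xx responses
    interval: 10s
    base_ejection_time: 30s
```

The circuit_breakers and outlier_detection blocks are where the connection limits and passive health tracking described above are configured. Note that load balancing across pods only happens here if DNS returns more than one address, as it does for a headless service.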

The next logical step is to see how Envoy handles more complex routing scenarios, such as weighted routing for canary deployments.
