Envoy’s ability to terminate and initiate TLS connections means it can act as both a server and a client, making it a natural fit for mTLS.
Let’s see how two Envoy instances can establish mutual TLS.
Imagine we have two services, serviceA and serviceB, each running behind an Envoy proxy.
# envoy.yaml for serviceA
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8080
    listener_filters:
    # tls_inspector detects TLS so the transport_protocol match below can apply
    - name: envoy.filters.listener.tls_inspector
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    filter_chains:
    - filter_chain_match:
        transport_protocol: "tls"
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_https
          codec_type: AUTO
          route_config:
            virtual_hosts:
            - name: serviceA_vhost
              domains: ["*"] # Match any Host header
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: serviceB_cluster
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          require_client_certificate: true # Reject clients that present no certificate
          common_tls_context:
            tls_certificates:
            - certificate_chain:
                filename: "/etc/envoy/certs/serviceA.crt"
              private_key:
                filename: "/etc/envoy/certs/serviceA.key"
            alpn_protocols: ["envoy"]
            validation_context:
              trusted_ca:
                filename: "/etc/envoy/certs/serviceB.crt"
  clusters:
  - name: serviceB_cluster
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    lb_policy: ROUND_ROBIN
    dns_lookup_family: V4_ONLY
    load_assignment:
      cluster_name: serviceB_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: "serviceB" # Hostname of serviceB
                port_value: 9090
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        common_tls_context:
          tls_certificates:
          - certificate_chain:
              filename: "/etc/envoy/certs/serviceA.crt"
            private_key:
              filename: "/etc/envoy/certs/serviceA.key"
          alpn_protocols: ["envoy"]
          validation_context:
            trusted_ca:
              filename: "/etc/envoy/certs/serviceB.crt"
The serviceA Envoy listens on port 8080 for TLS traffic. Its DownstreamTlsContext uses serviceA.crt and serviceA.key as its own identity. Crucially, its validation_context names serviceB.crt as the trusted_ca, so the listener accepts a client certificate only if it chains to serviceB.crt: either serviceB.crt itself (when it is self-signed) or a certificate issued by it (when it is a CA certificate). A validation_context on its own only makes Envoy verify a client certificate when one is presented; to reject clients that present no certificate at all, the DownstreamTlsContext must also set require_client_certificate: true.
When serviceA’s Envoy needs to connect to serviceB, it uses the serviceB_cluster. The UpstreamTlsContext there is configured with serviceA.crt and serviceA.key, which is the identity serviceA presents to serviceB. Its validation_context also names serviceB.crt as the trusted_ca, meaning it expects serviceB to present a certificate that chains to serviceB.crt.
The serviceB Envoy configuration mirrors this, with the certificate roles swapped:
# envoy.yaml for serviceB
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 9090
    listener_filters:
    # tls_inspector detects TLS so the transport_protocol match below can apply
    - name: envoy.filters.listener.tls_inspector
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    filter_chains:
    - filter_chain_match:
        transport_protocol: "tls"
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_https
          codec_type: AUTO
          route_config:
            virtual_hosts:
            - name: serviceB_vhost
              domains: ["*"] # Match any Host header
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: serviceA_cluster # For demonstration: applies only if serviceB initiates requests to serviceA
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          require_client_certificate: true # Reject clients that present no certificate
          common_tls_context:
            tls_certificates:
            - certificate_chain:
                filename: "/etc/envoy/certs/serviceB.crt"
              private_key:
                filename: "/etc/envoy/certs/serviceB.key"
            alpn_protocols: ["envoy"]
            validation_context:
              trusted_ca:
                filename: "/etc/envoy/certs/serviceA.crt"
  clusters:
  - name: serviceA_cluster # For demonstration, if serviceB initiated
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    lb_policy: ROUND_ROBIN
    dns_lookup_family: V4_ONLY
    load_assignment:
      cluster_name: serviceA_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: "serviceA" # Hostname of serviceA
                port_value: 8080
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        common_tls_context:
          tls_certificates:
          - certificate_chain:
              filename: "/etc/envoy/certs/serviceB.crt"
            private_key:
              filename: "/etc/envoy/certs/serviceB.key"
          alpn_protocols: ["envoy"]
          validation_context:
            trusted_ca:
              filename: "/etc/envoy/certs/serviceA.crt"
The core of mTLS here is the exchange. When serviceA’s Envoy connects to serviceB, it presents serviceA.crt and expects serviceB to present a certificate that validates against serviceA’s trusted_ca (serviceB.crt). Conversely, serviceB’s Envoy, acting as the server, presents serviceB.crt and expects clients to present a certificate that validates against its own trusted_ca (serviceA.crt). This bidirectional trust is what authenticates both parties to the connection.
The alpn_protocols: ["envoy"] setting is worth a note. It is not required for mTLS itself. ALPN (Application-Layer Protocol Negotiation) lets the two sides agree, during the TLS handshake, on which application protocol will run over the connection, so both proxies know they are speaking the same "language". The value "envoy" here is a custom label rather than a registered identifier such as h2 or http/1.1. If the setting were omitted, the handshake would still succeed; the proxies would simply not negotiate an application protocol through ALPN.
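For comparison, here is a sketch of the same field advertising the standard HTTP identifiers instead of a custom label. It is only a fragment of a DownstreamTlsContext, and whether it suits your deployment depends on which HTTP versions your clients speak.

```yaml
# Fragment of a DownstreamTlsContext (illustrative, not a full config)
common_tls_context:
  # Offer HTTP/2 first, then HTTP/1.1; with codec_type: AUTO the
  # connection manager can use the negotiated value to pick a codec.
  alpn_protocols: ["h2", "http/1.1"]
```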
The validation_context is where the actual trust is established. By specifying trusted_ca, you tell Envoy which certificates to accept as trust anchors when validating the peer’s certificate. In this direct setup, each service’s own certificate serves as the trust anchor for the other side, which works when those certificates are self-signed. For more complex setups with a dedicated CA, you’d place the CA’s certificate here instead.
The certificate_chain refers to the public certificate (followed by any intermediate certificates), and private_key refers to the corresponding private key. Envoy uses this pair to prove its identity to the peer.
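As a rough sketch of the dedicated-CA variant, the fragment below shows serviceA’s upstream context trusting a shared CA and additionally pinning the expected peer name. The file name ca.crt and the DNS SAN value "serviceB" are assumptions for illustration; they are not part of the configuration above.

```yaml
# Fragment of serviceA's UpstreamTlsContext (illustrative)
common_tls_context:
  tls_certificates:
  - certificate_chain:
      filename: "/etc/envoy/certs/serviceA.crt"
    private_key:
      filename: "/etc/envoy/certs/serviceA.key"
  validation_context:
    trusted_ca:
      filename: "/etc/envoy/certs/ca.crt" # hypothetical shared CA certificate
    # Trusting a CA accepts every certificate it issued, so also
    # require the peer certificate to carry the expected name.
    match_typed_subject_alt_names:
    - san_type: DNS
      matcher:
        exact: "serviceB" # assumes serviceB's certificate has this DNS SAN
```

The SAN matcher matters more once a CA is involved: without it, any service holding a certificate from the same CA would pass validation.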
The hostname "serviceB" in the address field of the load_assignment for serviceB_cluster, combined with type: LOGICAL_DNS, means Envoy resolves the endpoint through DNS. If you were using static IP addresses or a service discovery mechanism that provides IPs directly, you would replace "serviceB" with the appropriate IP address and change the cluster type accordingly (for example, to STATIC).
The connect_timeout of 0.25s is a value often seen in Envoy examples (if the field is unset, Envoy defaults to 5s). It means Envoy waits a quarter of a second for a connection to the upstream cluster to be established. If this is too short for your network conditions, you might see upstream connection errors.
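A minimal sketch of the static-IP variant of the cluster follows. The address 10.0.0.12 is a placeholder, and the transport_socket block is omitted because it would stay the same as in the full configuration.

```yaml
# Fragment of static_resources (illustrative)
clusters:
- name: serviceB_cluster
  connect_timeout: 0.25s
  type: STATIC # no DNS lookup; endpoints are literal IP addresses
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: serviceB_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 10.0.0.12 # hypothetical IP address of serviceB
              port_value: 9090
  # transport_socket (UpstreamTlsContext) unchanged from the full configuration
```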
When Envoy acts as a downstream server (the DownstreamTlsContext), the validation_context is used to verify the client’s certificate. When Envoy acts as an upstream client (the UpstreamTlsContext), the validation_context is used to verify the server’s certificate. In mTLS, both contexts are configured to validate the peer.
The most surprising thing about mTLS between Envoy proxies is that, for simple service-to-service interactions, you can often configure it using just each service’s public certificate as the trusted CA for the other. Provided those certificates are self-signed, this creates a tight, self-contained security boundary without a separate, external Certificate Authority.
This setup authenticates both the client and the server. As long as serviceB’s listener requires a client certificate, no request can reach serviceB unless it comes from an Envoy presenting a certificate that validates against serviceA.crt, and serviceA will not send a request to serviceB unless serviceB presents a certificate that validates against serviceB.crt.
The next step you might encounter is managing certificate rotation, as these certificates will eventually expire.