Cilium can operate as a service mesh without injecting Envoy sidecars into your application pods.
Let’s see Cilium’s service mesh capabilities in action.
Imagine you have a simple Kubernetes cluster with two applications, frontend and backend:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: frontend
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: frontend
      template:
        metadata:
          labels:
            app: frontend
        spec:
          containers:
          - name: frontend
            image: nginxdemos/hello:plain-text
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: frontend
    spec:
      selector:
        app: frontend
      ports:
      - protocol: TCP
        port: 80
        targetPort: 80
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: backend
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: backend
      template:
        metadata:
          labels:
            app: backend
        spec:
          containers:
          - name: backend
            image: nginxdemos/hello:plain-text
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: backend
    spec:
      selector:
        app: backend
      ports:
      - protocol: TCP
        port: 80
        targetPort: 80

Normally, to enable service mesh features like mTLS or traffic routing, you’d use a tool like Istio or Linkerd, which injects a proxy (Envoy for Istio, linkerd2-proxy for Linkerd) as a sidecar container into each of your application pods. This adds per-pod resource overhead and operational complexity.
Cilium’s service mesh mode bypasses this. Instead of sidecars, Cilium relies on its eBPF-powered data plane. When Cilium is installed in service mesh mode, it loads eBPF programs directly into the Linux kernel on each node. These programs intercept network traffic at the kernel level, before it reaches your application pods and after it leaves them.
Here’s how it works:
- Installation: You install Cilium with the enable-servicemesh flag set to true. This configures Cilium to manage service mesh policies.

      helm install cilium cilium/cilium --version <cilium-version> \
        --namespace kube-system \
        --set enable-servicemesh=true \
        --set ingressController.enabled=false \
        --set kubeProxyReplacement=strict \
        --set autoDirectNodeRoutes=true \
        --set cluster.name=my-cluster \
        --set ipam.mode=kubernetes

  (Note: Specific Helm values vary with your cluster setup and Cilium version, so confirm each one against the chart you are installing. kubeProxyReplacement=strict is often recommended for full control; Cilium 1.14 and later deprecate strict in favor of true.)

- ServiceMesh Policy: You define CiliumServiceMeshPolicy resources to dictate service mesh behavior, such as mTLS.

      apiVersion: cilium.io/v2alpha1
      kind: CiliumServiceMeshPolicy
      metadata:
        name: backend-mtls
      spec:
        target:
          namespace: default
          selector:
            matchLabels:
              app: backend
        mtls:
          mode: STRICT

  This policy enforces strict mTLS for all pods with the label app: backend in the default namespace.

- eBPF Enforcement: Cilium’s agent on each node, running in service mesh mode, reads these policies. It then compiles them into eBPF programs and attaches them to network interfaces (such as the veth pairs connected to pods, or the host’s network interface). When traffic destined for or originating from the backend pods arrives at the node’s kernel, the eBPF program intercepts it.

  - Ingress Traffic: If incoming traffic isn’t TLS-encrypted with a valid client certificate recognized by Cilium, the eBPF program drops it.
  - Egress Traffic: If an application tries to send unencrypted traffic to a backend service that requires mTLS, the eBPF program intercepts it, encrypts it using the pod’s identity certificate, and then forwards the encrypted traffic.

The key is that this happens within the kernel, before the packet is processed by the user-space network stack or reaches the application’s socket. This eliminates the need for a separate Envoy process running in each pod, reducing resource consumption and latency.
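
Resource kinds and API versions differ between Cilium releases, so it is worth checking which ones your cluster actually serves (for example with kubectl api-resources) before applying a policy. As a point of comparison, upstream Cilium 1.14 and later express mutual authentication through an authentication block on a CiliumNetworkPolicy ingress rule. The following is a minimal sketch under that assumption; the policy name backend-mutual-auth is hypothetical, and the labels and port come from the Deployments shown earlier.

```yaml
# Sketch only: assumes a Cilium release with mutual authentication
# support enabled. The name "backend-mutual-auth" is hypothetical.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-mutual-auth
  namespace: default
spec:
  # Applies to the backend pods.
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    # Require the peers to be mutually authenticated before traffic is allowed.
    authentication:
      mode: "required"
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
```

With a policy of this shape, traffic from frontend to backend on port 80 is only admitted once the two workload identities have authenticated each other; other ingress to backend is denied because the endpoint is now selected by a policy.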
To see mTLS in action, let’s try to access the backend service from the frontend pod.
First, apply the CiliumServiceMeshPolicy above.
Then, exec into the frontend pod and try to curl the backend service:

    kubectl exec -it <frontend-pod-name> -- curl -v http://backend:80

Without the CiliumServiceMeshPolicy, this would likely succeed. With the STRICT mTLS policy in place, and assuming you haven’t configured any certificates for the frontend to present, the request fails. Because curl is speaking plain HTTP here, expect a connection reset or a timeout rather than a TLS handshake error from curl itself; the exact message depends on how the traffic is rejected. The eBPF program intercepts the outbound request, sees that the backend requires mTLS, and because the frontend pod doesn’t have a valid certificate (or isn’t configured to use one), the connection is terminated at the kernel level.
If you were to configure the frontend application or its pod with a client certificate recognized by Cilium, the connection would succeed, and you’d see the backend’s response.
Cilium’s service mesh mode also handles advanced traffic management features like:
- Traffic Splitting: Gradually rolling out new versions by directing a percentage of traffic to a new deployment.
- Retries and Timeouts: Configuring resilience patterns for inter-service communication.
- Circuit Breaking: Automatically stopping traffic to unhealthy services.
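These are configured declaratively through Kubernetes resources rather than per-pod proxy settings. Which resource applies depends on the feature and the Cilium version, so check the documentation for your release: L7 behavior such as traffic splitting and retries is typically expressed through Gateway API routes or CiliumEnvoyConfig objects and served by an Envoy proxy that Cilium runs once per node, while eBPF handles L3/L4 enforcement and redirects traffic to that proxy.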
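
As one concrete illustration of traffic splitting, the Kubernetes Gateway API, which Cilium can implement when its Gateway API support is enabled, expresses a weighted split as an HTTPRoute. This is a sketch under several assumptions: the Gateway API CRDs are installed, a Gateway named demo-gateway exists and uses Cilium’s gateway class, and a second Service named backend-v2 fronts the new version. Both of those names are hypothetical and do not appear in the manifests above.

```yaml
# Sketch only: "demo-gateway" and "backend-v2" are hypothetical names.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: backend-split
  namespace: default
spec:
  parentRefs:
  - name: demo-gateway      # assumed Gateway handled by Cilium
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    # Roughly 90% of requests go to the existing backend Service...
    - name: backend
      port: 80
      weight: 90
    # ...and roughly 10% to the new version.
    - name: backend-v2
      port: 80
      weight: 10
```

Shifting the rollout is then a matter of editing the two weight values; weights are relative, so 90 and 10 behave the same as 9 and 1.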
The one thing most people don’t realize is that Cilium’s service mesh mode doesn’t just intercept traffic; it can also generate the necessary TLS certificates dynamically based on Kubernetes Service Accounts and identity. This means you can enable mTLS without manually managing certificate authorities or distributing certificates. Cilium acts as a local CA for your cluster’s services, signing certificates for each pod’s identity, which are then used by the eBPF programs for secure communication.
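
In upstream Cilium releases that ship mutual authentication, certificate issuance is backed by a SPIRE deployment that the Helm chart can install for you. The values below are a sketch of how that is usually switched on; treat the keys as something to verify against the values reference for your chart version rather than as a definitive configuration.

```yaml
# Helm values sketch: assumes a Cilium chart version that includes
# mutual authentication. Verify key names against your chart.
authentication:
  mutual:
    spire:
      enabled: true       # turn on SPIRE-backed mutual authentication
      install:
        enabled: true     # let the chart deploy the SPIRE server and agents
```

These values can be supplied with helm upgrade using -f and a values file, alongside whatever other settings your installation already uses.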
With all service mesh policies enforced, the next challenge often involves debugging why a specific policy isn’t being applied as expected, leading you to explore Cilium’s policy debugging tools.