Scrape Cilium Metrics with Prometheus (2026)

Cilium’s metrics are designed to be scraped by Prometheus, but Prometheus can’t magically find them without explicit configuration.

Here’s how it works:

Cilium components, like the agent (cilium-agent) and the operator (cilium-operator), expose metrics on specific HTTP endpoints. Prometheus, a time-series database and monitoring system, needs to be told where to find these endpoints and how often to collect data from them.

Let’s see Cilium’s metrics in action. Consider a Kubernetes cluster with Cilium installed. The cilium-agent pod runs on every node and does most of the work: it enforces network policies, handles IP address management, and maintains the underlying network connectivity. Once metrics are enabled, the agent serves them over HTTP on a configurable port. The examples here use 9090; recent Helm chart releases default to 9962, so check your install’s prometheus.port value and substitute it wherever 9090 appears.
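
If Cilium was installed with Helm, the metrics endpoints are switched on through chart values. The fragment below is a minimal sketch, assuming a Helm-based install. Pinning the agent port to 9090 is a choice made only so that it lines up with the examples in this article; it is not the chart’s own default.

```yaml
# values.yaml fragment for the Cilium Helm chart (assumes a Helm-based install)
prometheus:
  enabled: true    # serve cilium-agent metrics over HTTP
  port: 9090       # assumption: pinned to match this article's examples
operator:
  prometheus:
    enabled: true  # serve cilium-operator metrics as well
```

Apply it with your usual helm upgrade workflow, then confirm the port on a running agent pod before pointing Prometheus at it.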

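Here’s an illustrative snippet of what you might see if you curl that endpoint directly (assuming you have port-forwarded or otherwise have network access). Metric names and labels vary between Cilium versions, so treat these lines as examples of the format, not as an exact listing: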

# HELP cilium_agent_cgroup_memory_bytes_total cgroup memory usage in bytes
# TYPE cilium_agent_cgroup_memory_bytes_total gauge
cilium_agent_cgroup_memory_bytes_total{cgroup="kubepods",namespace="default",pod="my-app-pod",container="my-app-container"} 1.2345e+08
# HELP cilium_bpf_map_entry_count number of entries in a BPF map
# TYPE cilium_bpf_map_entry_count gauge
cilium_bpf_map_entry_count{mapname="cilium_ipcache",node="worker-node-1"} 1500
# HELP cilium_network_policy_violations_total total number of network policy violations
# TYPE cilium_network_policy_violations_total counter
cilium_network_policy_violations_total{direction="ingress",from_identity="123",to_identity="456",rule_name="deny-all"} 5

Prometheus needs to be configured to scrape these. This is typically done via a Service and ServiceMonitor (if you’re using the Prometheus Operator) or a direct scrape_configs entry in your Prometheus configuration.

The ServiceMonitor Approach (with Prometheus Operator):

This is the most common and Kubernetes-native way.

Create a Service for Cilium’s metrics: You need a Kubernetes Service that selects all Cilium agent pods and exposes the metrics port. Select on a label shared by every agent pod (k8s-app: cilium in a standard install), not on the name of an individual pod, which would match at most one agent:

apiVersion: v1
kind: Service
metadata:
  name: cilium-metrics
  namespace: kube-system # Or wherever Cilium is installed
  labels:
    app: cilium
spec:
  selector:
    k8s-app: cilium # This label is usually present on Cilium agent pods
  ports:
  - name: metrics
    port: 9090
    targetPort: 9090

Create a ServiceMonitor: This tells the Prometheus Operator which Service to monitor.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cilium
  namespace: monitoring # The namespace where Prometheus is running
  labels:
    release: prometheus # Label used by Prometheus Operator to discover ServiceMonitors
spec:
  selector:
    matchLabels:
      app: cilium # This must match the labels on your cilium-agent Service
  namespaceSelector:
    matchNames:
    - kube-system # The namespace where your Cilium agent Service is
  endpoints:
  - port: metrics # Matches the 'name' in your Service's ports
    interval: 30s # How often to scrape
    path: /metrics # The default metrics path for Cilium
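
The release: prometheus label on the ServiceMonitor only matters if the Prometheus custom resource is set up to select it. The following is a minimal sketch of that side of the pairing, assuming a Prometheus Operator install; the resource name and service account name are hypothetical, and your install may use a different selector label entirely.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus                 # hypothetical name
  namespace: monitoring
spec:
  serviceAccountName: prometheus   # hypothetical; needs RBAC to read services, endpoints, and pods
  # Pick up every ServiceMonitor carrying this label
  serviceMonitorSelector:
    matchLabels:
      release: prometheus
  # An empty selector means "look for ServiceMonitors in all namespaces"
  serviceMonitorNamespaceSelector: {}
```

If the Cilium target never shows up in Prometheus, comparing the ServiceMonitor’s labels against this selector is usually the first thing to check.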

The Direct Prometheus Configuration Approach:

If you’re not using the Prometheus Operator, you’ll manually add scrape configurations to your prometheus.yml.

Find Cilium Agent Pods: You need to dynamically discover the Cilium agent pods. Kubernetes service discovery is perfect for this.

Add to prometheus.yml. The relabeling rules keep only the Cilium agent pods and point each target at the agent’s metrics port and path:

scrape_configs:
  - job_name: 'cilium-agent'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods that have the 'k8s-app: cilium' label
      - source_labels: [__meta_kubernetes_pod_label_k8s_app]
        action: keep
        regex: cilium
      # Default target: the pod IP plus the Cilium metrics port (9090 in this guide).
      # IPv6 pod IPs would additionally need square brackets around the address.
      - source_labels: [__meta_kubernetes_pod_ip]
        action: replace
        target_label: __address__
        regex: (.+)
        replacement: ${1}:9090
      # If the pod carries a prometheus.io/port annotation, use that port instead.
      # Without the annotation the regex does not match and the default above stays.
      - source_labels: [__meta_kubernetes_pod_ip, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: (.+);(\d+)
        replacement: ${1}:${2}
      # Override the metrics path only when a prometheus.io/path annotation is set;
      # otherwise Prometheus keeps its default, /metrics
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Use pod name as instance label for clarity
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: instance

What problem does this solve? This configuration allows Prometheus to discover and collect detailed operational metrics from your Cilium network components. These metrics provide insights into network policy enforcement, BPF map usage, performance, errors, and the overall health of your cluster’s networking layer.

How it works internally: Cilium components are built with Prometheus client libraries. They expose an HTTP endpoint (usually /metrics) that serves metrics in the Prometheus text format. Prometheus, acting as a client, periodically polls this endpoint. The ServiceMonitor or scrape_configs tell Prometheus which endpoints to poll (based on Kubernetes labels and service discovery) and how often.

The levers you control:

Scrape Interval: How frequently Prometheus fetches metrics (interval in ServiceMonitor, or scrape_interval in Prometheus config). Shorter intervals give more granular data but increase load.
Metrics Path: The HTTP path where metrics are exposed (path in ServiceMonitor, or __metrics_path__ relabeling). Cilium defaults to /metrics.
Port: The network port the metrics are served on (port in Service, or prometheus.io/port annotation).
Selector Labels: The Kubernetes labels used to identify Cilium components for scraping. Consistency here is key.
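
For the direct configuration approach, the interval and path levers sit on the scrape job itself. The fragment below is a sketch of where they go; the specific durations are illustrative assumptions, not recommendations.

```yaml
global:
  scrape_interval: 60s       # default for every job (illustrative value)
scrape_configs:
  - job_name: 'cilium-agent'
    scrape_interval: 30s     # per-job override of the global default
    scrape_timeout: 10s      # must not be longer than scrape_interval
    metrics_path: /metrics   # job-wide path; relabeling __metrics_path__ can still override it per target
    kubernetes_sd_configs:
      - role: pod
    # relabel_configs: the pod-selection and port rules from the direct configuration section go here
```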

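One thing that is easy to overlook is that the operator exposes its own metrics too. If you’re using the Prometheus Operator, you’d create a separate Service and ServiceMonitor for the cilium-operator pods. The operator serves metrics on its own port, separate from the agent’s (9963 in recent Helm chart releases, set by operator.prometheus.port), so confirm the value for your install.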
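
A sketch of that operator pairing follows. The resource names and the app: cilium-operator Service label are hypothetical, the io.cilium/app: operator pod label is an assumption about how the operator pods are labeled in your install, and the port is a placeholder to replace with whatever your operator actually listens on.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cilium-operator-metrics   # hypothetical name
  namespace: kube-system          # or wherever Cilium is installed
  labels:
    app: cilium-operator          # hypothetical label, matched by the ServiceMonitor below
spec:
  selector:
    io.cilium/app: operator       # assumption: verify with kubectl get pods --show-labels
  ports:
  - name: metrics
    port: 9963                    # assumption: adjust to your operator's metrics port
    targetPort: 9963
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cilium-operator           # hypothetical name
  namespace: monitoring
  labels:
    release: prometheus           # must match your Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: cilium-operator
  namespaceSelector:
    matchNames:
    - kube-system
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
```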

The next step is often configuring alerting rules in Prometheus based on these collected Cilium metrics.
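
As a starting point, a rules file might look like the sketch below. It assumes the cilium-agent job name from the direct configuration approach (with a ServiceMonitor the job label is typically derived from the Service name instead). The alert names and thresholds are illustrative, and cilium_drop_count_total should be checked against the metrics your Cilium version actually exposes.

```yaml
groups:
  - name: cilium
    rules:
      # Fires when Prometheus cannot scrape an agent target
      - alert: CiliumAgentDown
        expr: up{job="cilium-agent"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Cilium agent {{ $labels.instance }} is not being scraped"
      # Fires on a sustained packet drop rate; the threshold is an arbitrary placeholder
      - alert: CiliumHighDropRate
        expr: sum by (instance) (rate(cilium_drop_count_total[5m])) > 100
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Cilium on {{ $labels.instance }} is dropping packets at an elevated rate"
```

Load the file through rule_files in prometheus.yml, or wrap the same groups in a PrometheusRule resource if you are using the Prometheus Operator.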