Prometheus and Grafana are the de facto standard for metrics-based observability in Kubernetes, but getting them set up on EKS involves more than just running a few Docker images.
Let’s see Prometheus in action, scraping metrics from a sample Nginx deployment on EKS.
First, we need a way to expose metrics from our applications. For Nginx, we can use the ingress-nginx controller, which has built-in Prometheus metrics served on port 10254.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ingress-nginx
  template:
    metadata:
      labels:
        app: ingress-nginx
    spec:
      containers:
        - name: controller
          # k8s.gcr.io is frozen; registry.k8s.io is the current image registry
          image: registry.k8s.io/ingress-nginx/controller:v1.0.0
          ports:
            - name: http
              containerPort: 80
            - name: https
              containerPort: 443
            - name: metrics
              containerPort: 10254 # Prometheus metrics endpoint
          args:
            - /nginx-ingress-controller
            - --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
            - --election-id=ingress-controller-leader
            - --controller-class=k8s.io/ingress-nginx
            - --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
            - --validating-webhook=:8443
            - --validating-webhook-certificate=/usr/local/certificates/cert
            - --validating-webhook-key=/usr/local/certificates/key
          env:
            # Referenced as $(POD_NAMESPACE) in the args above
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
Now, Prometheus needs to know where to find these metrics. This is where ServiceMonitor custom resources, provided by the Prometheus Operator, come in. A ServiceMonitor tells Prometheus which Services to scrape, on which ports, and how often.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: nginx-ingress-sm
  namespace: monitoring # Prometheus namespace
spec:
  selector:
    matchLabels:
      app: ingress-nginx # Matches labels on the Nginx Ingress Controller Service
  namespaceSelector:
    matchNames:
      - ingress-nginx # Scrapes Services in the ingress-nginx namespace
  endpoints:
    - port: metrics # The name of the port on the Service (not the container port)
      interval: 15s # Scrape every 15 seconds
      path: /metrics # The path where Prometheus metrics are exposed
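A ServiceMonitor selects Services, not Deployments, so it only produces scrape targets if a Service in the ingress-nginx namespace carries the label app: ingress-nginx and exposes a port named metrics. The Deployment shown earlier does not create one. A minimal sketch of such a Service follows; the Service name is a hypothetical choice, while the labels and port number are taken from the manifests in this article.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller-metrics # hypothetical name
  namespace: ingress-nginx
  labels:
    app: ingress-nginx # matched by the ServiceMonitor's spec.selector
spec:
  type: ClusterIP
  selector:
    app: ingress-nginx # selects the controller pods created by the Deployment
  ports:
    - name: metrics # referenced by the ServiceMonitor's endpoints[].port
      port: 10254
      targetPort: metrics # the named container port on the controller pods
```

Keeping metrics on a separate ClusterIP Service is one way to avoid exposing port 10254 through the load balancer that fronts HTTP and HTTPS traffic.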
Grafana, on the other hand, is our visualization layer. It pulls data from Prometheus (and other sources) and displays it in dashboards. To connect Grafana to Prometheus, we configure Prometheus as a data source within Grafana.
Here’s a simplified example: a ConfigMap holding grafana.ini settings, followed by Grafana Operator configuration that registers Prometheus as a data source:
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-config
  namespace: monitoring
data:
  grafana.ini: |
    [server]
    # The listen port is the http_port setting in [server]
    http_port = 3000
    root_url = http://localhost:3000
    [database]
    type = sqlite3
    path = /var/lib/grafana/grafana.db
    [paths]
    data = /var/lib/grafana
    logs = /var/log/grafana
    [auth]
    # Keep the login form enabled so the admin user can sign in with a password
    disable_login_form = false
    [analytics]
    reporting_enabled = false
    [plugins]
    allow_loading_unsigned_plugins = grafana-piechart-panel
    [snapshots]
    enabled = true
    [users]
    allow_sign_up = false
    auto_assign_org = true
    auto_assign_org_id = 1
    auto_assign_org_role = Viewer
    [log]
    mode = console
    [feature_toggles]
    enable = flameGraph
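---
# In the v1beta1 API of the Grafana Operator, grafana.ini settings go under
# spec.config as sections, and data sources are separate GrafanaDatasource resources.
apiVersion: grafana.integreatly.org/v1beta1
kind: Grafana
metadata:
  name: grafana
  namespace: monitoring
  labels:
    dashboards: grafana # Label that the GrafanaDatasource below selects on
spec:
  config:
    security:
      admin_user: admin
      admin_password: prom-operator # Example only; use a Secret in real clusters
---
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
  name: prometheus
  namespace: monitoring
spec:
  instanceSelector:
    matchLabels:
      dashboards: grafana
  datasource:
    name: Prometheus
    type: prometheus
    url: http://prometheus-operated.monitoring.svc:9090 # Prometheus service URL
    access: proxy
    isDefault: true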
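For a Grafana instance that is not managed by the Grafana Operator, the same data source can be declared in Grafana's own provisioning file format. The sketch below assumes a ConfigMap named grafana-datasources (a hypothetical name) that is mounted into the Grafana pod under /etc/grafana/provisioning/datasources; the URL is the one used above.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources # hypothetical name
  namespace: monitoring
data:
  prometheus.yaml: |
    # Grafana data source provisioning file
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy # Grafana's backend proxies queries to Prometheus
        url: http://prometheus-operated.monitoring.svc:9090
        isDefault: true
```

Grafana reads provisioning files at startup, so the pod typically needs a restart after this ConfigMap changes.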
The core idea is that Prometheus acts as the collector and time-series database for your metrics, and Grafana is the presentation layer that queries Prometheus to build informative dashboards. The ServiceMonitor is the glue that tells Prometheus what to collect from your Kubernetes services.
When you install Prometheus using the kube-prometheus-stack Helm chart, it automatically creates ServiceMonitor resources for common cluster components and exporters such as kube-state-metrics, node-exporter, and the Kubernetes API server. For your own applications, you create additional ServiceMonitor resources.
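One detail to watch when adding your own ServiceMonitor to a kube-prometheus-stack installation: by default the chart configures Prometheus to select only ServiceMonitors that carry the Helm release label, so a ServiceMonitor without that label is silently ignored. You can either add the release label to the ServiceMonitor or relax the selector. A sketch of the second option as a values.yaml fragment, assuming you are comfortable with Prometheus picking up every ServiceMonitor in the cluster:

```yaml
# values.yaml fragment for the kube-prometheus-stack chart (sketch)
prometheus:
  prometheusSpec:
    # true (the chart default): select only ServiceMonitors labeled with the Helm release.
    # false: an empty serviceMonitorSelector matches all ServiceMonitors.
    serviceMonitorSelectorNilUsesHelmValues: false
```

Selecting everything is convenient for a demo cluster; on shared clusters, labeling each ServiceMonitor explicitly gives tighter control over what gets scraped.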
One of the most powerful, yet often overlooked, aspects of Prometheus is its service discovery mechanism. Instead of relying on a static list of targets, Prometheus discovers them through the Kubernetes API. As pods scale up or down, Prometheus starts or stops scraping them without any manual intervention. The ServiceMonitor resource builds on this: its selectors match Kubernetes Services, and Prometheus scrapes the pod endpoints behind each matching Service.
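To make the mechanism concrete, here is roughly what the nginx-ingress-sm ServiceMonitor corresponds to in plain Prometheus configuration. This is a hand-written approximation for illustration, not the exact output of the Prometheus Operator, which adds further relabeling rules; the job name is an assumption.

```yaml
scrape_configs:
  - job_name: nginx-ingress # assumed name for this sketch
    scrape_interval: 15s
    metrics_path: /metrics
    kubernetes_sd_configs:
      - role: endpoints # discover the pod endpoints behind Services
        namespaces:
          names:
            - ingress-nginx
    relabel_configs:
      # Keep only endpoints whose Service has the label app=ingress-nginx
      - source_labels: [__meta_kubernetes_service_label_app]
        regex: ingress-nginx
        action: keep
      # Keep only the endpoint port named "metrics"
      - source_labels: [__meta_kubernetes_endpoint_port_name]
        regex: metrics
        action: keep
```

Because discovery is driven by the Kubernetes API, scaling the Deployment from 2 to 5 replicas adds three targets without touching this configuration.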
The next logical step is to explore how to alert on these metrics using Prometheus Alertmanager.