Caddy can expose metrics in the Prometheus text format, but recent releases do not collect per-request HTTP metrics by default; you have to explicitly enable and configure them.
Here’s how you can set up Caddy to expose metrics for Prometheus to scrape:
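First, make sure your Caddyfile both enables metrics collection and exposes a metrics endpoint. The metrics handler directive takes only an optional request matcher; it has no path or listen options, so a dedicated listener is written as its own site block. In recent Caddy 2 releases, per-request HTTP metrics also have to be switched on in the global options block.
{
	servers {
		metrics
	}
}

yourdomain.com {
	reverse_proxy localhost:8080
}

:9180 {
	bind 127.0.0.1
	metrics /metrics
}
In this configuration:
servers { metrics }: This global option turns on collection of per-request HTTP metrics. It is needed from Caddy 2.6 onward; newer releases (2.9 and later) prefer a top-level metrics global option for the same purpose, so check the documentation for the version you run.
:9180 with bind 127.0.0.1: This site block makes Caddy listen on port 9180, and bind restricts the listener to the loopback interface. Binding to 127.0.0.1 (localhost) means the endpoint is only reachable from the Caddy server itself, which is often a good security practice. You can use bind 0.0.0.0 if you need to expose it on a network interface, but be mindful of firewall rules.
metrics /metrics: This serves the metrics on the /metrics path, which is the conventional path for Prometheus endpoints.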
After updating your Caddyfile, you need to reload Caddy for the changes to take effect. If you’re running Caddy as a service, you can typically do this with:
sudo systemctl reload caddy
Or, if you’re running it manually:
caddy reload --config /path/to/your/Caddyfile
Now, Caddy will be listening on 127.0.0.1:9180 and serving metrics on the /metrics path.
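A quick way to confirm the endpoint is live is to fetch it and look for the "# TYPE" comment lines that the Prometheus text format uses. Here is a minimal sketch, assuming Python 3 and only the standard library. The helper names (fetch_metrics, metric_types) and the sample text are illustrative rather than anything Caddy provides, and the URL assumes the listener address used in this article.

```python
import re
import urllib.request

# Hypothetical helper: fetch the raw metrics text from Caddy.
# The default URL assumes the 127.0.0.1:9180 listener used in this article.
def fetch_metrics(url="http://127.0.0.1:9180/metrics", timeout=5.0):
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

# Matches comment lines such as "# TYPE caddy_http_requests_total counter".
_TYPE_LINE = re.compile(r"^# TYPE (\S+) (\S+)$")

def metric_types(text):
    """Return {metric family name: metric type} from the '# TYPE' lines."""
    types = {}
    for line in text.splitlines():
        match = _TYPE_LINE.match(line.strip())
        if match:
            types[match.group(1)] = match.group(2)
    return types

# Illustrative sample text, not captured from a real server.
SAMPLE = """\
# HELP caddy_http_requests_total Counter of HTTP(S) requests made.
# TYPE caddy_http_requests_total counter
caddy_http_requests_total{handler="reverse_proxy",server="srv0"} 1234
# TYPE go_goroutines gauge
go_goroutines 42
"""

if __name__ == "__main__":
    # Swap SAMPLE for fetch_metrics() to check a running Caddy instance.
    for name, kind in sorted(metric_types(SAMPLE).items()):
        print(f"{name}: {kind}")
```

If the dictionary comes back empty against a live server, the endpoint is probably serving something other than metrics, for example a proxied page from another site block.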
Next, you need to configure Prometheus to scrape these metrics. This involves adding a scrape job to your Prometheus configuration file (prometheus.yml).
scrape_configs:
  - job_name: 'caddy'
    static_configs:
      - targets: ['localhost:9180']
        labels:
          instance: 'your-caddy-instance-name'
In this Prometheus configuration:
job_name: 'caddy': This names the scrape job. Prometheus attaches the name to every scraped series as the job label.
targets: ['localhost:9180']: This tells Prometheus where to find the Caddy metrics endpoint. If your Caddy server is on a different machine, replace localhost with its IP address or hostname, and make sure the metrics listener is bound to an address Prometheus can reach.
labels: You can add custom labels to the scraped metrics. Setting instance here overrides the default value (the target address) and is a common way to distinguish between multiple Caddy instances.
After updating your prometheus.yml file, you need to reload Prometheus. This can usually be done by sending a SIGHUP signal to the Prometheus process or by restarting the Prometheus service.
# If running Prometheus directly, find its PID and send SIGHUP
kill -HUP <prometheus_pid>
# Or, if running as a service
sudo systemctl reload prometheus
Once Prometheus has reloaded its configuration and Caddy is running with the metrics enabled, Prometheus should start scraping the /metrics endpoint on localhost:9180. You can verify this by navigating to the "Status" -> "Targets" page in the Prometheus UI. You should see your caddy job with localhost:9180 as a target, and its state should be "UP".
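The same check can be scripted against the Prometheus HTTP API, whose /api/v1/targets endpoint reports the health of every active target. The following is a sketch only: it assumes Prometheus is reachable on its default address, localhost:9090, and the function names and trimmed sample payload are illustrative.

```python
import json
import urllib.request

# Assumption: Prometheus listens on its default address, localhost:9090.
def fetch_targets(base_url="http://localhost:9090", timeout=5.0):
    url = f"{base_url}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

def job_health(payload, job="caddy"):
    """Return [(scrape_url, health)] for the active targets of one job."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl", ""), t.get("health", "unknown"))
        for t in targets
        if t.get("labels", {}).get("job") == job
    ]

# Trimmed, illustrative response; a real one carries more fields per target.
SAMPLE_PAYLOAD = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "caddy", "instance": "your-caddy-instance-name"},
                "scrapeUrl": "http://localhost:9180/metrics",
                "health": "up",
            },
            {
                "labels": {"job": "node", "instance": "localhost:9100"},
                "scrapeUrl": "http://localhost:9100/metrics",
                "health": "down",
            },
        ]
    },
}

if __name__ == "__main__":
    # Swap SAMPLE_PAYLOAD for fetch_targets() to query a running Prometheus.
    for scrape_url, health in job_health(SAMPLE_PAYLOAD):
        print(f"{scrape_url}: {health}")
```

A health value of "down" usually comes with a lastError field in the real response, which is the first place to look when the target does not come up.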
The metrics exposed by Caddy include request counts, response sizes, latency, and various internal state information. These can be invaluable for monitoring the performance and health of your Caddy web server.
Caddy’s metrics are more than a handful of global counters: each series carries labels that describe the traffic it measures, and the values are updated as requests are served.
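When Caddy handles an HTTP request, it updates the relevant series. For example, a GET request that results in a 200 OK response increments caddy_http_requests_total for the server and handler that processed it, and is recorded in the caddy_http_request_duration_seconds histogram under code="200" and method="GET". Stock Caddy does not label these metrics by request path, which keeps the number of series bounded. This labeling allows for powerful filtering and aggregation in Prometheus.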
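To make the idea of aggregating by label concrete, here is a small sketch that parses a few sample lines and sums them per label value, roughly what a PromQL expression of the form sum by (code) (...) does on the server side. It assumes Python 3; the label names (code, method, handler, server) are what you would expect from a stock Caddy build, but treat them as assumptions and adjust them to whatever your own /metrics output shows. The sample lines are made up.

```python
import re
from collections import defaultdict

# name{label="value",...} value   (optional trailing timestamps are not handled)
_SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

def parse_samples(text):
    """Return a list of (name, labels, value) from exposition-format text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = _SAMPLE_LINE.match(line)
        if not match:
            continue  # skip anything this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples

def sum_by(samples, metric, label):
    """Sum the values of one metric, grouped by the value of one label."""
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name == metric:
            totals[labels.get(label, "")] += value
    return dict(totals)

# Illustrative lines, not captured from a real server.
SAMPLE = """\
caddy_http_request_duration_seconds_count{code="200",handler="reverse_proxy",method="GET",server="srv0"} 1234
caddy_http_request_duration_seconds_count{code="404",handler="reverse_proxy",method="GET",server="srv0"} 17
caddy_http_request_duration_seconds_count{code="500",handler="reverse_proxy",method="POST",server="srv0"} 5
"""

if __name__ == "__main__":
    parsed = parse_samples(SAMPLE)
    metric = "caddy_http_request_duration_seconds_count"
    print(sum_by(parsed, metric, "code"))
    print(sum_by(parsed, metric, "method"))
```

In practice you would let Prometheus do this aggregation; the point of the sketch is only to show that every label combination is its own series and that grouping is a sum over those series.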
Here’s a glimpse of what the raw metrics output might look like if you were to curl localhost:9180/metrics:
# HELP caddy_http_requests_total Counter of HTTP(S) requests made.
# TYPE caddy_http_requests_total counter
caddy_http_requests_total{handler="reverse_proxy",server="srv0"} 1239
# HELP caddy_http_request_duration_seconds Histogram of round-trip request durations.
# TYPE caddy_http_request_duration_seconds histogram
caddy_http_request_duration_seconds_bucket{code="200",handler="reverse_proxy",method="GET",server="srv0",le="0.1"} 1000
caddy_http_request_duration_seconds_bucket{code="200",handler="reverse_proxy",method="GET",server="srv0",le="0.5"} 1200
caddy_http_request_duration_seconds_bucket{code="200",handler="reverse_proxy",method="GET",server="srv0",le="+Inf"} 1234
caddy_http_request_duration_seconds_sum{code="200",handler="reverse_proxy",method="GET",server="srv0"} 150.5
caddy_http_request_duration_seconds_count{code="200",handler="reverse_proxy",method="GET",server="srv0"} 1234
# HELP caddy_config_last_reload_successful Whether the last configuration reload attempt was successful.
# TYPE caddy_config_last_reload_successful gauge
caddy_config_last_reload_successful 1
This output is abridged and illustrative; the exact metric names, help strings, buckets, and label sets depend on your Caddy version, so compare it against your own /metrics output.
The caddy_http_requests_total metric is a counter that is incremented for every incoming request. In stock Caddy its labels are server (the name of the HTTP server in Caddy’s configuration, such as srv0) and handler (the handler that processed the request). The status code and method are carried by the caddy_http_request_duration_seconds histogram as the code and method labels, so its _count series lets you ask questions like "How many POST requests ended in a 500 yesterday?" Request paths are not used as labels, so per-path breakdowns have to come from access logs instead.
The caddy_http_request_duration_seconds metric is a histogram, which is how Caddy provides latency distributions. Histograms track observations and group them into predefined cumulative buckets (e.g., le="0.1" counts every request that took 0.1 seconds or less). Dividing _sum by _count gives the average latency, and the buckets let you estimate percentiles like the 95th or 99th, which are often more indicative of user experience.
The caddy_config_last_reload_successful metric is a gauge that is 1 when the most recent configuration reload succeeded and 0 when it failed. Its companion, caddy_config_last_reload_success_timestamp_seconds, records when the last successful reload happened. Together they are useful for debugging and for alerting on a reload that did not take effect.
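Percentile estimation from histogram buckets is easier to follow with a worked example. The sketch below assumes Python 3 and reuses the illustrative bucket counts from the sample output (1000 requests at or under 0.1 s, 1200 at or under 0.5 s, 1234 in total). It interpolates linearly inside the bucket that contains the target rank, which is the same idea behind PromQL’s histogram_quantile() function; it is a simplified illustration, not Prometheus’s actual implementation.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs whose last
    upper bound is math.inf, mirroring the le="+Inf" bucket.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: the best available
                # answer is the highest finite bound.
                return prev_bound
            if count == prev_count:
                return prev_bound  # empty bucket, nothing to interpolate
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound

# Bucket counts taken from the illustrative sample output.
BUCKETS = [(0.1, 1000), (0.5, 1200), (math.inf, 1234)]
DURATION_SUM = 150.5
DURATION_COUNT = 1234

if __name__ == "__main__":
    print("mean:", DURATION_SUM / DURATION_COUNT)       # about 0.122 s
    print("p50: ", histogram_quantile(0.50, BUCKETS))   # about 0.062 s
    print("p95: ", histogram_quantile(0.95, BUCKETS))   # about 0.445 s
    print("p99: ", histogram_quantile(0.99, BUCKETS))   # 0.5 s, capped at the last finite bound
```

Note how the 99th percentile is capped at 0.5 s: with only these buckets, anything slower than the highest finite bound cannot be resolved further, which is why bucket boundaries matter for tail latency.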
One aspect that often trips people up is understanding how Caddy’s internal directives and modules contribute to the metrics. Modules register their own series: if you’re using reverse_proxy, for example, you’ll see metrics about upstream health such as caddy_reverse_proxy_upstreams_healthy, and a third-party plugin such as a rate limiter may add series of its own (check the plugin’s documentation for the exact names). The metrics directive itself doesn’t add metrics; it merely exposes the metrics generated by other parts of Caddy.
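One way to see which parts of Caddy contribute which metrics is to group the metric names by prefix. This is a rough sketch, assuming Python 3; the prefix list is an assumption based on common Caddy and Go runtime metric names, and example_plugin_events_total is a hypothetical name standing in for whatever a plugin might register.

```python
from collections import defaultdict

# Assumed prefixes; extend the tuple for any plugins you run.
PREFIXES = (
    "caddy_http_",
    "caddy_reverse_proxy_",
    "caddy_admin_",
    "caddy_config_",
    "go_",
    "process_",
)

def group_by_prefix(names, prefixes=PREFIXES):
    """Group metric names under the first matching prefix, else 'other'."""
    groups = defaultdict(list)
    for name in sorted(names):
        for prefix in prefixes:
            if name.startswith(prefix):
                groups[prefix].append(name)
                break
        else:
            groups["other"].append(name)
    return dict(groups)

# Illustrative names; the last one is hypothetical.
NAMES = [
    "caddy_http_requests_total",
    "caddy_http_request_duration_seconds",
    "caddy_reverse_proxy_upstreams_healthy",
    "go_goroutines",
    "process_cpu_seconds_total",
    "example_plugin_events_total",
]

if __name__ == "__main__":
    for prefix, members in group_by_prefix(NAMES).items():
        print(prefix, members)
```

Anything that lands in the "other" group is worth a closer look, since it usually comes from a plugin or from a Caddy version that names things differently.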
The next step after setting up basic metrics scraping is to explore advanced Prometheus features like alerting rules based on Caddy’s performance, or using Grafana to build sophisticated dashboards visualizing Caddy’s traffic patterns and health.