Rate Limit Requests in Caddy to Protect Your Backend (2026)

Caddy’s rate limiting is a surprisingly flexible tool that can protect your backend services without acting like a blunt instrument.

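Let’s see it in action. Imagine you have a backend API at http://localhost:8080 and you want to limit requests to 10 per minute per IP address. Stock Caddy does not ship a rate_limit directive, so the examples in this article assume a build that includes the mholt/caddy-ratelimit plugin (for example, xcaddy build --with github.com/mholt/caddy-ratelimit). With that module, the configuration looks like this:

{
    # rate_limit comes from a plugin, so Caddy must be told where it runs.
    # Needed once per Caddyfile; basic_auth assumes Caddy 2.8 or newer.
    order rate_limit before basic_auth
}

yourdomain.com {
    rate_limit {
        zone per_ip {
            key {remote_host}
            window 1m
            events 10
        }
    }
    reverse_proxy localhost:8080
}

This configuration tells Caddy to check every request to yourdomain.com against the limit before proxying it. The zone name (per_ip) is arbitrary. key {remote_host} groups requests by client IP address, and each key is allowed at most 10 requests (events 10) within a sliding one-minute window (window 1m). If an IP exceeds this limit, Caddy returns a 429 Too Many Requests response with a Retry-After header. Note that {remote_host} is the address of the immediate peer, so behind a CDN or load balancer all traffic may end up sharing a single key.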

The problem this solves is simple: an overloaded or misbehaving client can bring down your backend. This could be a legitimate but poorly written script, a bot, or even a denial-of-service attack. By offloading the responsibility of enforcing request limits to Caddy, your backend application doesn’t have to deal with the overhead of tracking and rejecting excessive requests, keeping it free to serve legitimate traffic.

Internally, Caddy maintains a "zone" for each rate limit configuration. This zone is where it stores the counts and timestamps for each entity being limited (e.g., IP addresses, client headers, etc.). When a request comes in, Caddy checks if the entity associated with that request exists in the zone and if the request count within the defined window has been exceeded. If it has, the request is immediately rejected. If not, the count is incremented, and the request is forwarded to the backend.

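The value that requests are grouped by is where the real flexibility comes in. In the mholt/caddy-ratelimit syntax this is the zone’s key, and it can be built from any Caddy placeholder. While the client IP is common, you can also limit on other criteria. For example, to limit based on a specific header, like an API key:

yourdomain.com {
    rate_limit {
        zone per_api_key {
            key {http.request.header.X-API-Key}
            window 1m
            events 100
        }
    }
    reverse_proxy localhost:8080
}

Here, each unique value found in the X-API-Key header gets its own budget of 100 requests per minute. Every key in a zone gets the same budget, so tiered API access (different allowances for different clients) needs one zone per tier, each with a matcher that selects that tier’s requests. Requests that omit the header all share a single empty key, so it is worth rejecting them or limiting them separately.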

Another powerful option is a match block inside a zone, which restricts that zone to specific URL paths. This is useful if a particular endpoint is more resource-intensive than others.

yourdomain.com {
    rate_limit {
        zone api_v1 {
            match {
                path /api/v1/*
            }
            key static
            window 1h
            events 1000
        }
        zone api_v2 {
            match {
                path /api/v2/*
            }
            key static
            window 1h
            events 500
        }
    }
    reverse_proxy localhost:8080
}

In this scenario, requests to /api/v1/* are limited to 1000 per hour in total, while requests to /api/v2/* are limited to 500 per hour. Each zone uses a path matcher to select its traffic, and key static makes all matching requests share one counter; replace static with {remote_host} to give every client its own budget per path. Unlike handle_path, a matcher does not strip the path prefix, so the backend still sees the full URL.

The request count (events in the mholt/caddy-ratelimit syntax) sets the maximum number of requests allowed within the window. The window sets the time span over which requests are counted. Durations take a unit suffix such as s for seconds, m for minutes, h for hours, and d for days.

One of the most useful, yet often overlooked, options is customizing the error response. Instead of returning a bare 429, you can provide more context or even redirect the user to a helpful page. The rate limiter itself only produces the 429 error; the response is shaped with Caddy’s standard handle_errors directive.

yourdomain.com {
    rate_limit {
        zone per_ip {
            key {remote_host}
            window 1m
            events 10
        }
    }
    reverse_proxy localhost:8080

    handle_errors {
        @throttled expression `{err.status_code} == 429`
        respond @throttled "Too many requests from this IP. Please try again later." 429
    }
}

This puts a custom message in the 429 response body, making it clearer to the client why it is being throttled. Swapping respond for redir would send throttled users to an explanatory page instead.

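Beyond the zone’s key, window, and request count, it matters which requests a zone sees at all. A zone without a match block applies to every request that reaches the rate_limit handler, so placing the directive at the site level, ahead of any handle or handle_path blocks, also covers requests that match none of those blocks. This is useful for catching unexpected or malformed requests that might otherwise bypass your rate limiting. The mholt/caddy-ratelimit module has no separate include_unmatched switch; coverage is decided by where the directive sits and by each zone’s matchers.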

The counters behind these limits live in Caddy’s own process. Without external storage, they are held in memory and lost when Caddy restarts. For limits that must outlive a restart or be shared across several Caddy instances, you’d typically configure an external storage mechanism such as Redis; with mholt/caddy-ratelimit, that means enabling its distributed mode on top of a shared Caddy storage module.

The next step in managing traffic flow is often implementing circuit breaking to prevent cascading failures when a backend is truly unhealthy.