Rate Limit API and Web Traffic with Cloudflare Rules (2026)

Cloudflare’s rate limiting does more than cap requests: it counts matching requests over a sliding window so that abusive or accidental traffic surges can be identified and mitigated before they reach your origin servers.

Let’s see it in action. Imagine a scenario where a bot is hammering your login endpoint (/api/v1/login) with thousands of requests per minute. We want to block any IP address that makes more than 100 requests to this specific endpoint within a 1-minute window.

Here’s how you’d configure that in Cloudflare:

Navigate to Security > WAF > Rate Limiting Rules in your Cloudflare dashboard (exact menu names vary between dashboard versions).
Click "Create rule."
Rule Name: Block High Login Traffic
Description: Block IPs making >100 requests to /api/v1/login per minute.
When incoming requests match…
- Field: URI Path
- Operator: equals
- Value: /api/v1/login
…and requests are made by the same IP address… (This is the default and what we want for IP-based rate limiting).
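…when the rate exceeds…
- Requests: 100
- Period: 1 minute
…then take this action:
- Action: Block
- Duration: 1 minute (how long the block stays in place once the threshold is exceeded; an example value, so choose what suits your endpoint)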

This rule, once deployed, will monitor traffic hitting /api/v1/login. If any single IP address exceeds 100 requests within a 60-second rolling window, Cloudflare will block further matching requests from that IP for the rule’s mitigation duration, by default responding with a 429 Too Many Requests status code. Requests from that IP to other paths are unaffected, because they do not match the rule.
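The same rule can also be written as data for Cloudflare’s Rulesets API, where rate limiting rules belong to the http_ratelimit phase. The block below is a minimal sketch under stated assumptions, not a definitive implementation: the helper name build_login_rate_limit_rule is hypothetical, the 60-second mitigation_timeout is an assumed value, and the field names should be checked against the current API reference before use.

```python
import json


def build_login_rate_limit_rule(path="/api/v1/login", requests_per_period=100, period=60):
    """Return a rate limiting rule as a dict (hypothetical helper, not an official client)."""
    return {
        "description": "Block High Login Traffic",
        # Criteria: which requests the rule counts.
        "expression": f'(http.request.uri.path eq "{path}")',
        # Action applied once the threshold is exceeded.
        "action": "block",
        "ratelimit": {
            # Count per client IP; cf.colo.id scopes counters to a data center.
            "characteristics": ["cf.colo.id", "ip.src"],
            "period": period,  # seconds
            "requests_per_period": requests_per_period,
            "mitigation_timeout": 60,  # seconds the block lasts (assumed value)
        },
    }


rule = build_login_rate_limit_rule()
print(json.dumps(rule, indent=2))
```

This only builds and prints the rule body. Sending it to the API (with a zone ID and an API token) is deliberately left out, since those details depend on your account setup.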

The core problem rate limiting solves is protecting your application’s availability and performance from overwhelming traffic. This can be legitimate but unexpected surges (like a viral social media post) or malicious attacks (like DDoS or credential stuffing). By setting these limits, you create a buffer, ensuring that even if a flood of requests hits, only a controlled amount reaches your origin, preventing overload, downtime, and increased infrastructure costs.

Internally, Cloudflare uses a distributed system to track request counts per client identifier (by default, the IP address) within defined time windows. When a request arrives, Cloudflare checks whether that client identifier has already reached the threshold for the specified period. If it has, the rule’s action (here, Block) is applied. If not, the count is incremented. The "sliding window" is crucial: it’s not a fixed calendar minute (e.g., 10:00:00 to 10:00:59), but a rolling 60-second period ending at the current moment. So, if a request comes in at 10:00:30, the window runs from 09:59:30 to 10:00:30. This closes the loophole at the edge of a fixed window, where a client could send 100 requests at 10:00:59 and another 100 at 10:01:00, getting 200 requests through in about a second without exceeding the limit in either calendar minute.
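To make the rolling window concrete, here is a small single-process sketch of the idea. It is an illustration only and not Cloudflare’s implementation, which is distributed and may approximate the count; the class name SlidingWindowLimiter and its parameters are hypothetical. It keeps one timestamp per allowed request and discards any that are older than the period.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Toy sliding-window limiter keyed by client identifier (e.g. an IP)."""

    def __init__(self, threshold=100, period=60.0):
        self.threshold = threshold
        self.period = period
        self._hits = defaultdict(deque)  # client_id -> timestamps of allowed requests

    def allow(self, client_id, now):
        hits = self._hits[client_id]
        # Drop timestamps that fell out of the rolling window (now - period, now].
        while hits and hits[0] <= now - self.period:
            hits.popleft()
        if len(hits) >= self.threshold:
            return False  # threshold already reached: block this request
        hits.append(now)  # otherwise count it and let it through
        return True


# A tiny threshold keeps the demonstration short: 3 requests per 60 seconds.
limiter = SlidingWindowLimiter(threshold=3, period=60.0)
print([limiter.allow("203.0.113.7", t) for t in (0, 10, 20, 30, 61)])
# → [True, True, True, False, True]
```

The request at t=30 is rejected because three requests already fall inside the preceding 60 seconds. By t=61 the request from t=0 has aged out of the window, so one more is allowed.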

The levers you control are the criteria (which requests to monitor: by URL, method, headers, cookies, etc.), the client identifier (what to count requests against: IP address, a specific header or cookie value, ASN, country, etc.), the threshold (how many requests are too many), and the period (over what duration the threshold applies). You can also choose different actions: Block, Managed Challenge, JS Challenge, Interactive Challenge (the successor to the legacy CAPTCHA action), or Log. Which identifiers, periods, and actions are available depends on your Cloudflare plan. For more sophisticated scenarios, you can combine multiple criteria and even use custom expressions.
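One way to keep these levers straight is to model them explicitly. The sketch below is illustrative only: RateLimitLevers is a hypothetical name, the action strings and the characteristics syntax follow the Rulesets API as best understood here and should be verified against current documentation, and other fields such as a mitigation timeout are omitted. The example counts POST requests under /api/ per API key rather than per IP.

```python
from dataclasses import dataclass

# Action names assumed to match the Rulesets API; verify before relying on them.
ACTIONS = {"block", "managed_challenge", "js_challenge", "challenge", "log"}


@dataclass
class RateLimitLevers:
    expression: str       # criteria: which requests to monitor
    counted_by: str       # client identifier: what requests are counted against
    threshold: int        # how many requests are too many
    period_seconds: int   # duration over which the threshold applies
    action: str = "block"

    def to_rule(self):
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action}")
        return {
            "expression": self.expression,
            "action": self.action,
            "ratelimit": {
                "characteristics": ["cf.colo.id", self.counted_by],
                "period": self.period_seconds,
                "requests_per_period": self.threshold,
            },
        }


levers = RateLimitLevers(
    expression='http.request.method eq "POST" and starts_with(http.request.uri.path, "/api/")',
    counted_by='http.request.headers["x-api-key"]',
    threshold=20,
    period_seconds=10,
    action="managed_challenge",
)
print(levers.to_rule())
```

Changing a single field, such as counted_by or action, changes one lever without touching the others, which mirrors how the rule form in the dashboard is organized.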

A common misconception is that rate limiting only applies to new connections. In reality, it applies to every request that matches the rule’s criteria, regardless of whether the underlying TCP connection is persistent or new. This means even if a client is using HTTP/2 or HTTP/3 with connection reuse, each individual HTTP request is counted against the rate limit.

The next step in managing traffic effectively is understanding how to use Custom Rate Limiting Expressions to create more granular and context-aware rules.