Implement Rate Limiting in ASP.NET Core Middleware (2026)

ASP.NET Core’s built-in rate limiting middleware (available since .NET 7) is a policy engine that lets you enforce request quotas globally, per endpoint, or per client. It builds on the in-memory limiters in System.Threading.RateLimiting and partitions them by any key you can derive from the request, such as the client IP address or the HTTP method.

Let’s see it in action. Imagine you have a public API endpoint that’s getting hammered, and you want to cap it at 10 requests per minute.

using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

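// Add services to the container.
builder.Services.AddRateLimiter(options =>
{
    // The default rejection status is 503; 429 is the conventional choice.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Named policy: 10 requests per one-minute window, no queuing.
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.QueueLimit = 0;
    });

    // Runs for every rejected request, whichever policy rejected it.
    options.OnRejected = async (context, cancellationToken) =>
    {
        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please try again later.", cancellationToken);
    };
});

var app = builder.Build();

// Configure the HTTP request pipeline.
// If you call UseRouting explicitly, UseRateLimiter must come after it.
app.UseRateLimiter();

// A named policy applies only to endpoints that opt in.
app.MapGet("/", () => "Hello World!").RequireRateLimiting("fixed");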

app.Run();

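Here, AddFixedWindowLimiter registers a policy named "fixed". A named policy only takes effect on endpoints it is attached to with RequireRateLimiting. Once attached, requests beyond the tenth in any one-minute window are rejected with a 429 Too Many Requests status code. Note that this limiter is a single counter shared by every caller of those endpoints, not a per-client quota; per-client limits require a partitioned policy. The rejection callback supplies a custom response body.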

The core problem this solves is preventing abuse and ensuring service availability. Without rate limiting, a single user or a bot could overwhelm your application, leading to denial of service for legitimate users. ASP.NET Core’s middleware provides a declarative way to define these limits without cluttering your controllers or business logic.

Internally, the AddRateLimiter extension method configures RateLimiterOptions, and UseRateLimiter() adds the middleware that intercepts incoming HTTP requests. For each request, the middleware first consults the global limiter, if one is configured, and then the policy attached to the matched endpoint, if any. Each policy, whether created by a helper such as AddFixedWindowLimiter or written by hand as an IRateLimiterPolicy<TPartitionKey>, maps the request to a partition key and a System.Threading.RateLimiting.RateLimiter for that partition. These limiters keep their counters in process memory. The framework does not ship a distributed store, so sharing counters across instances (for example in Redis) requires a custom or third-party limiter. If a limiter refuses the request, the configured rejection logic is executed; otherwise, the request proceeds.

RateLimiterOptions holds the global configuration. RejectionStatusCode sets the status code for rejected requests; it defaults to 503 Service Unavailable, so most APIs change it to 429. GlobalLimiter defines a limiter that applies to every request, in addition to any endpoint-specific policy. OnRejected lets you handle rejections in one place.
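As a rough sketch of these three options working together, the program below sets a global limiter with a single shared partition and adds a Retry-After header on rejection. The partition key "all-requests" and the limit of 100 per minute are illustrative assumptions, not recommendations.

```csharp
using System;
using System.Globalization;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Applies to every request that passes through UseRateLimiter.
    // Assumption: one shared partition, so all callers draw from the same window.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: "all-requests",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1),
                QueueLimit = 0
            }));

    options.OnRejected = async (context, cancellationToken) =>
    {
        // Some limiters attach a retry hint to the failed lease.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString(CultureInfo.InvariantCulture);
        }

        await context.HttpContext.Response.WriteAsync("Too many requests.", cancellationToken);
    };
});

var app = builder.Build();

app.UseRateLimiter();
app.MapGet("/", () => "Hello World!"); // covered by the global limiter without opting in

app.Run();
```

Because the global limiter needs no endpoint metadata, it is a reasonable place for a coarse safety net, with tighter named policies layered on individual endpoints.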

When you add specific limiters with AddFixedWindowLimiter, AddSlidingWindowLimiter, AddTokenBucketLimiter, or AddConcurrencyLimiter, you give each one a policyName. This name is how you apply the policy to specific endpoints: through RequireRateLimiting(policyName) on minimal API endpoints, or the [EnableRateLimiting(policyName)] attribute on controllers and actions.
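To illustrate two of the other built-in limiters, here is a minimal sketch that registers a sliding-window policy and a concurrency policy. The policy names, endpoint paths, and numeric limits are hypothetical and chosen only for the example.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Sliding window: 100 requests per minute, tracked in six 10-second segments,
    // which smooths out the burst a fixed window allows at its boundary.
    options.AddSlidingWindowLimiter("sliding", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.SegmentsPerWindow = 6;
        limiterOptions.QueueLimit = 0;
    });

    // Concurrency: at most 4 requests in flight; up to 2 more wait in a queue.
    options.AddConcurrencyLimiter("concurrency", limiterOptions =>
    {
        limiterOptions.PermitLimit = 4;
        limiterOptions.QueueLimit = 2;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Hypothetical endpoints, each opting in to one policy by name.
app.MapGet("/search", () => "Search results").RequireRateLimiting("sliding");
app.MapGet("/report", () => "Expensive report").RequireRateLimiting("concurrency");

app.Run();
```

The concurrency limiter caps simultaneous work rather than requests per unit of time, which tends to suit endpoints whose cost is dominated by how long each request runs.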

For example, to apply the "fixed" policy only to a specific endpoint:

app.MapGet("/api/limited", () => "This is a rate-limited endpoint.")
   .RequireRateLimiting("fixed");

The RequireRateLimiting extension method attaches metadata to the endpoint, which the UseRateLimiter middleware reads to apply the correct policy. The middleware acts as a gatekeeper, deciding whether a request may reach your application logic based on predefined rules.
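The same metadata mechanism also works on route groups. The sketch below, with hypothetical paths, applies a policy to everything under /api and then opts one endpoint out again with DisableRateLimiting.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 10;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Every endpoint mapped on this group inherits the "fixed" policy metadata.
var api = app.MapGroup("/api").RequireRateLimiting("fixed");

api.MapGet("/limited", () => "This is a rate-limited endpoint.");

// Hypothetical health check that should never be throttled.
api.MapGet("/health", () => "OK").DisableRateLimiting();

app.Run();
```

On controllers, the [EnableRateLimiting("fixed")] and [DisableRateLimiting] attributes play the same roles.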

A common misconception is that the rate limiter only tracks requests globally. In fact, limits can be scoped quite narrowly:

Per-Route: Applying a limiter to every endpoint under a route prefix, for example by calling RequireRateLimiting on a route group.
Per-Endpoint: Targeting a single MapGet or MapPost, as in the /api/limited example above.
Per-HTTP Method: Endpoints mapped separately for GET, POST, and so on can each carry their own policy, or a partitioned policy can include HttpContext.Request.Method in its partition key.
Per-Client: A partitioned policy (registered with AddPolicy, or assigned to GlobalLimiter) keeps a separate counter for each partition key, which you derive from the request: the IP address in HttpContext.Connection.RemoteIpAddress, an API key header, the authenticated user name, or any other identifier. No distributed store is needed for this. The counters live in process memory, so a distributed store only matters when several application instances must share them.
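A per-client policy might look like the following sketch. It assumes clients identify themselves with a hypothetical X-Api-Key header and falls back to the remote IP address; the policy name and endpoint path are likewise illustrative.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddPolicy<string>("per-client", httpContext =>
    {
        // Assumption: "X-Api-Key" is a header this API happens to use.
        var apiKey = httpContext.Request.Headers["X-Api-Key"].ToString();

        // Fall back to the connection's IP address when no key is sent.
        var clientKey = !string.IsNullOrEmpty(apiKey)
            ? apiKey
            : httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";

        // Each distinct key gets its own independent fixed window.
        return RateLimitPartition.GetFixedWindowLimiter(clientKey, _ =>
            new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10,
                Window = TimeSpan.FromMinutes(1),
                QueueLimit = 0
            });
    });
});

var app = builder.Build();

app.UseRateLimiter();
app.MapGet("/api/limited", () => "This is a rate-limited endpoint.")
   .RequireRateLimiting("per-client");

app.Run();
```

Two caveats apply to a sketch like this. Behind a reverse proxy, RemoteIpAddress is the proxy's address unless forwarded headers are configured. And keys taken from unauthenticated input let a caller create arbitrarily many partitions, so validating the key before partitioning on it is usually worthwhile.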

The AddTokenBucketLimiter policy, for instance, offers a more sophisticated approach than fixed windows. It allows for bursts of requests up to a certain capacity (TokenLimit) and replenishes tokens at a steady rate (ReplenishmentPeriod and TokensPerPeriod). This provides a smoother experience for users, allowing them to occasionally exceed the average rate without immediate rejection, as long as they don’t exhaust the bucket.

When implementing AddTokenBucketLimiter, remember that ReplenishmentPeriod and TokensPerPeriod work together. If ReplenishmentPeriod is TimeSpan.FromSeconds(10) and TokensPerPeriod is 5, it means 5 tokens are added every 10 seconds, resulting in an average rate of 0.5 tokens per second.
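As a configuration sketch using the numbers above, the policy below adds 5 tokens every 10 seconds. The bucket capacity of 20, the policy name, and the endpoint path are assumptions made for the example.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddTokenBucketLimiter("token", limiterOptions =>
    {
        limiterOptions.TokenLimit = 20;                               // burst capacity (assumed)
        limiterOptions.TokensPerPeriod = 5;                           // tokens added per period
        limiterOptions.ReplenishmentPeriod = TimeSpan.FromSeconds(10); // 5 / 10 s = 0.5 tokens/s
        limiterOptions.AutoReplenishment = true;                      // refill on an internal timer
        limiterOptions.QueueLimit = 0;                                // reject rather than queue
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Hypothetical endpoint that tolerates short bursts.
app.MapGet("/api/bursty", () => "Burst-friendly endpoint.")
   .RequireRateLimiting("token");

app.Run();
```

With these values a caller that has been idle can send up to 20 requests back to back, after which the sustained rate settles at roughly one request every two seconds.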

The most surprising thing about ASP.NET Core’s rate limiting is how seamlessly it integrates with the endpoint routing system, allowing for policies to be attached directly to endpoints as metadata, which the middleware then interprets at runtime. This means you can define rate limits directly alongside your API route definitions, making the configuration co-located with the code it governs.

The next concept you’ll likely encounter is rate limiting across multiple instances of your application. The built-in limiters count in process memory, so each instance enforces its own quota; sharing counters through Redis or another distributed cache requires a custom or third-party limiter.