Protect APIs with API Keys and Usage Plans (2026)

An API key is less a secret password than a credential that identifies your application to the API provider.

Let’s see this in action. Imagine a simple weather API. We want to control who can access it and how much they can use it.

# Example API Key Generation (Conceptual)
# In a real system, this would be handled by an API gateway or management platform.
import secrets

api_key = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe Base64 encoded
owner_id = "user_123"
plan_id = "free_tier_plan"

# Associate key with owner and plan
database.insert("api_keys", {"key": api_key, "owner_id": owner_id, "plan_id": plan_id})

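# Example Usage Plan Definition
usage_plan = {
    "plan_id": "free_tier_plan",  # matches the plan_id stored with the key
    "rate_limit_per_minute": 100,
    "rate_limit_per_hour": 500,
    "burst_limit_per_minute": 120,  # Allows slightly more than rate_limit_per_minute for short bursts
}
database.insert("usage_plans", usage_plan)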

# Example API Request with Key
request = HttpRequest("GET", "https://api.weather.com/v1/current?location=London")
request.headers["X-API-Key"] = api_key

# API Gateway/Server-side Logic
def handle_request(request):
    api_key = request.headers.get("X-API-Key")
    if not api_key:
        return HttpResponse(401, "API Key missing")

    key_info = database.get("api_keys", "key", api_key)
    if not key_info:
        return HttpResponse(403, "Invalid API Key")

    plan = database.get("usage_plans", "plan_id", key_info["plan_id"])
    if not plan:
        return HttpResponse(500, "Configuration error: Usage plan not found")

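    # Rate Limiting Logic (simplified fixed-window counters)
    # Assumes get_current_timestamp() returns seconds since the epoch.
    current_time = get_current_timestamp()
    minute_window = int(current_time // 60)
    hour_window = int(current_time // 3600)
    # Keying each counter by its window means it starts from zero when the window rolls over.
    request_count_minute = cache.increment_and_get(f"rate_limit:{api_key}:minute:{minute_window}")
    request_count_hour = cache.increment_and_get(f"rate_limit:{api_key}:hour:{hour_window}")

    if (request_count_minute > plan["rate_limit_per_minute"]
            or request_count_hour > plan["rate_limit_per_hour"]):
        return HttpResponse(429, "Rate limit exceeded")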

    # Process the actual API request
    response = forward_to_weather_service(request)
    return response

This setup addresses a fundamental problem: how to grant access to your valuable data or functionality while preventing abuse and ensuring fair usage. Without these controls, a single errant script or a malicious actor could overwhelm your service, impacting all users and incurring significant costs.

Internally, an API gateway or a dedicated middleware layer typically intercepts every incoming request. It first validates the presence and authenticity of the X-API-Key header. If the key is valid, it then looks up the associated usage plan. This plan dictates the constraints, such as the maximum number of requests allowed within a given time window (e.g., per minute, per hour, per day) and potentially a burst limit for short, intense periods. This information is often stored in a database or a configuration store.
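The lookup sequence described above can be sketched as a small, runnable function. This is a minimal illustration rather than a gateway implementation: the in-memory dictionaries `API_KEYS` and `USAGE_PLANS`, the sample keys, and the function name `resolve_plan` are all hypothetical stand-ins for whatever database or configuration store a real deployment would use.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical in-memory stand-ins for the key store and the plan store.
API_KEYS = {
    "demo-key-123": {"owner_id": "user_123", "plan_id": "free_tier_plan"},
    "orphan-key-456": {"owner_id": "user_456", "plan_id": "missing_plan"},
}
USAGE_PLANS = {
    "free_tier_plan": {"rate_limit_per_minute": 100, "rate_limit_per_hour": 500},
}


def resolve_plan(headers: Mapping[str, str]) -> Tuple[int, Optional[dict]]:
    """Return (HTTP status, usage plan); the plan is None unless the status is 200."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    api_key = normalized.get("x-api-key")
    if not api_key:
        return 401, None  # no key presented
    key_info = API_KEYS.get(api_key)
    if key_info is None:
        return 403, None  # key is not recognized
    plan = USAGE_PLANS.get(key_info["plan_id"])
    if plan is None:
        return 500, None  # key points at a plan that does not exist
    return 200, plan
```

Returning the plan alongside the status keeps the rate limiting step separate: the caller only proceeds to counting requests once it holds a valid plan.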

The core of the enforcement lies in tracking request counts. A common approach uses a distributed cache like Redis to store counters for each API key and time window. When a request arrives, the corresponding counter is incremented. If the incremented value exceeds the limit defined in the usage plan, the gateway rejects the request with a 429 Too Many Requests status code. This check needs to be atomic to prevent race conditions where multiple requests arriving simultaneously might all pass the check before any counter is updated.
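To make the atomicity requirement concrete, here is a minimal single-process sketch of a fixed-window counter. It uses a `threading.Lock` in place of a distributed cache, so the class name `FixedWindowLimiter` and its `allow` method are illustrative assumptions, not part of any library. With Redis, the same effect is usually obtained from the `INCR` command, which increments and returns the new value in one atomic step, paired with an expiry on the key.

```python
import threading
import time
from typing import Dict, Optional, Tuple


class FixedWindowLimiter:
    """Counts requests per (key, window) and increments under a single lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # Old windows are never evicted here; a real store would expire them.
        self._counts: Dict[Tuple[str, int, int], int] = {}

    def allow(self, api_key: str, limit: int, window_seconds: int,
              now: Optional[float] = None) -> bool:
        timestamp = time.time() if now is None else now
        window = int(timestamp // window_seconds)
        slot = (api_key, window_seconds, window)
        # Read, increment, and store inside the lock so that two concurrent
        # callers can never observe the same count.
        with self._lock:
            count = self._counts.get(slot, 0) + 1
            self._counts[slot] = count
        return count <= limit
```

Because every caller receives a distinct post-increment value, exactly `limit` requests per window are admitted no matter how many arrive at once. Checking the count first and incrementing afterwards, as two separate steps, would not give that guarantee.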

The burst_limit_per_minute setting lets a client briefly exceed rate_limit_per_minute, provided its average over a longer period stays within the sustained limit. This matters for applications with sudden spikes in legitimate traffic, such as the start of a popular event or a flash sale. The simplified handler above does not enforce it: plain per-window counters cannot express "a little extra now, repaid later", so systems typically track requests in a sliding window or use a token bucket algorithm instead.
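A token bucket is one way to get this behavior. The sketch below is a minimal version under stated assumptions: the class name `TokenBucket` is hypothetical, and mapping `rate_limit_per_minute` to the refill rate and `burst_limit_per_minute` to the bucket capacity is one plausible reading of the plan above, not the only one.

```python
import time
from typing import Optional


class TokenBucket:
    """Refills at a steady rate and holds at most `capacity` tokens."""

    def __init__(self, rate_per_minute: float, capacity: float,
                 now: Optional[float] = None) -> None:
        self.rate_per_minute = rate_per_minute  # sustained rate
        self.capacity = capacity                # largest allowed burst
        self.tokens = float(capacity)           # start with a full bucket
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        elapsed = max(0.0, current - self.updated_at)
        # Add the tokens earned since the last call, capped at capacity.
        refill = elapsed * self.rate_per_minute / 60.0
        self.tokens = min(float(self.capacity), self.tokens + refill)
        self.updated_at = current
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # each admitted request spends one token
            return True
        return False


# Free-tier numbers from the plan: 100 requests/minute sustained, bursts up to 120.
bucket = TokenBucket(rate_per_minute=100, capacity=120, now=0.0)
burst = sum(bucket.allow(now=0.0) for _ in range(150))
```

With these numbers, a client that has been idle can send up to 120 requests at once, after which it is held to the refill rate of 100 per minute until the bucket fills again. The optional `now` argument exists only to make the behavior easy to test with a fixed clock.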

The most surprising part is how often the absence of a rate limit, rather than a faulty one, is the root cause of unexpected service degradation. Developers might expose an API with good intentions, assuming users will be reasonable, only to find a single client consuming the vast majority of their resources because it polls aggressively for updates. The system doesn’t inherently know what "reasonable" is; it needs explicit instructions via usage plans.

The next step is to consider authentication methods beyond simple API keys, such as OAuth 2.0 for delegated authorization.