API Gateway’s throttling and rate limiting aren’t about preventing abuse; they’re about keeping your own services from melting under load.

Let’s watch this happen. Imagine we have a simple API Gateway setup with a single endpoint /users that forwards to a backend Lambda function.

{
  "apiId": "abcdef123",
  "stageName": "prod",
  "routes": [
    {
      "path": "/users",
      "method": "GET",
      "integration": {
        "type": "AWS_PROXY",
        "integrationUri": "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:123456789012:function:my-user-lambda/invocations"
      },
      "rateLimit": {
        "burstLimit": 100,
        "fixedWindow": "60s"
      }
    }
  ]
}

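Here, we’ve defined a rate limit on the /users GET route, written in a simplified notation rather than API Gateway’s actual configuration schema. A burstLimit of 100 means the route accepts up to 100 requests in rapid succession. The fixedWindow of 60s means the allowance resets every minute, so in any given 60-second window at most 100 requests reach the Lambda function. In a real REST API stage, the equivalent knobs are the throttlingRateLimit (steady-state requests per second) and throttlingBurstLimit method settings.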
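The counting behavior described above can be sketched as a small fixed-window limiter. This is a minimal illustration of the semantics of the example config, not API Gateway’s internal implementation; the class name FixedWindowLimiter and its parameters are hypothetical.

```python
class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` window (illustrative only)."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = 0.0
        self.count = 0

    def allow(self, now: float) -> bool:
        # Start a fresh window once the current one has fully elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


limiter = FixedWindowLimiter(limit=100, window_seconds=60)
accepted = sum(limiter.allow(now=1.0) for _ in range(150))
print(accepted)                 # 100
print(limiter.allow(now=30.0))  # False: same window, allowance used up
print(limiter.allow(now=61.0))  # True: a new window has started
```

Passing the clock in as an argument keeps the sketch deterministic, so the three cases can be checked without sleeping.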

But wait, there’s more. A route-level limit like this is shared by every caller of the route; it says nothing about who is sending the traffic. To throttle per client, API Gateway uses API keys tied to a usage plan. To see this in action, we need to enable API key usage and require it for the method.

First, create an API key: aws apigateway create-api-key --name "MyUserApiKey" --enabled

This will output something like:

{
    "id": "a1b2c3d4e5",
    "name": "MyUserApiKey",
    "value": "x1y2z3w4v5u6t7s8r9q0p1o2n3m4l5k6",
    "enabled": true,
    "createdDate": "2023-10-27T10:00:00Z"
}

Then, associate it with a usage plan. A usage plan is what groups API keys and defines the throttling and quota settings.

{
  "usagePlanId": "zyxwvu",
  "name": "UserPlan",
  "apiStages": [
    {
      "apiId": "abcdef123",
      "stage": "prod"
    }
  ],
  "throttle": {
    "burstLimit": 50,
    "rateLimit": 10
  },
  "tags": {}
}

This usage plan has a burstLimit of 50 and a rateLimit of 10 requests per second. These apply in addition to any limit set at the route level: a request must pass both checks, so the more restrictive one decides. Compare them in the same units. The route allows 100 requests per minute, while the usage plan’s 10 requests per second works out to 600 requests per minute. Over a full minute, the route’s 100-request allowance is therefore the tighter constraint; within any single second, the usage plan’s 10 requests per second (with bursts of up to 50) is what caps a client.
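Limits stated over different periods are easier to compare once they are normalized to one unit. The sketch below does that arithmetic for the two limits in this example; the helper name per_minute is hypothetical.

```python
def per_minute(requests: float, period_seconds: float) -> float:
    """Convert 'requests per period' into requests per minute."""
    return requests * 60.0 / period_seconds


route_limit = per_minute(100, 60)  # route: 100 requests per 60-second window
plan_limit = per_minute(10, 1)     # usage plan: 10 requests per second

print(route_limit, plan_limit)       # 100.0 600.0
print(min(route_limit, plan_limit))  # 100.0, the sustained per-minute ceiling
```

Over a full minute the smaller normalized number bounds throughput; over a single second, the per-second limit is the one a client runs into first.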

Now, associate the API key with the usage plan: aws apigateway create-usage-plan-key --usage-plan-id zyxwvu --key-id a1b2c3d4e5 --key-type API_KEY

Finally, configure the method to require the API key: set apiKeyRequired to true for the /users GET method, either in the console or with aws apigateway update-method and a patch operation on /apiKeyRequired, then redeploy the stage so the change takes effect.

Now, when you send requests to /users with the x-api-key: x1y2z3w4v5u6t7s8r9q0p1o2n3m4l5k6 header, API Gateway will enforce the 10 requests per second limit. If you exceed this, you’ll get a 429 Too Many Requests response.
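On the client side, a 429 is usually a signal to slow down and retry rather than to fail outright. A minimal sketch of retrying with exponential backoff and jitter follows; call_with_backoff and the send callable are hypothetical stand-ins for whatever HTTP client actually issues the request with the x-api-key header.

```python
import random
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it returns a status other than 429 or attempts run out."""
    status = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Full jitter: wait a random time up to base_delay * 2^(attempt - 1).
        sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
        status = send()
    return status


# Simulated responses: throttled twice, then accepted.
responses = iter([429, 429, 200])
delays = []
status = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(status, len(delays))  # 200 2
```

Injecting the sleep function keeps the example fast to run; in real use the default time.sleep applies.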

The key insight here is how these limits interact. Route-level limits are specific to a path/method combination, while usage plan limits follow the API key across every API stage attached to the plan. A request has to pass both, so API Gateway effectively enforces the most restrictive combination.

The one thing most people don’t realize is that API Gateway’s own throttling is not a synchronized fixed window at all. AWS documents it as a token bucket: tokens refill continuously at the rate limit, and the burst limit caps how many can accumulate. A client that has been quiet for a few seconds can therefore spend a full bucket at once, so short-term throughput can be surprisingly high even when the average rate is low.
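AWS describes API Gateway throttling in terms of a token bucket, where the rate is the refill speed and the burst is the bucket size. The sketch below models that idea with the usage plan’s numbers (rate 10, burst 50); the TokenBucket class is a hypothetical illustration, not API Gateway’s code.

```python
class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=10, burst=50)
first = sum(bucket.allow(now=0.0) for _ in range(80))
print(first)  # 50: the full burst is spent at once
later = sum(bucket.allow(now=1.0) for _ in range(80))
print(later)  # 10: one second of refill at 10 tokens per second
```

The two printed numbers show the difference between the burst a rested client can send and the steady rate it settles into afterward.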

If you’re seeing 429 errors, the next step is usually CloudWatch. Throttled requests are counted in the 4XXError metric for the API stage; a rising 5XXError usually points at the backend rather than at API Gateway throttling. If you’ve enabled access logging, you’ll also see the individual entries with a 429 status.
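If access logs are written as JSON, counting throttled requests per path takes only a few lines. This sketch assumes a log format you have configured yourself with "path" and "status" fields; those field names and the sample lines are made up for illustration, since the access log format is user-defined.

```python
import json
from collections import Counter
from typing import Iterable


def count_throttled(log_lines: Iterable[str]) -> Counter:
    """Count 429 responses per path in JSON-formatted access log lines."""
    throttled = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not objects
        if str(entry.get("status")) == "429":
            throttled[entry.get("path", "unknown")] += 1
    return throttled


# Hypothetical sample lines, not real log output.
sample = [
    '{"requestId": "r1", "path": "/users", "status": "200"}',
    '{"requestId": "r2", "path": "/users", "status": "429"}',
    '{"requestId": "r3", "path": "/users", "status": "429"}',
]
print(count_throttled(sample))  # Counter({'/users': 2})
```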
