A single-region API Gateway deployment is a single point of failure: if that region has an outage, your entire API goes down with it.

Here’s how we’ll set up a multi-region API Gateway failover, making your API resilient to regional outages. We’ll use AWS Route 53 and API Gateway in two regions.

The Setup

Imagine we have two identical API Gateway deployments, one in us-east-1 (N. Virginia) and another in eu-west-1 (Ireland). Both are serving the exact same API.

Region 1: us-east-1

  • API Gateway: api.example.com (created in us-east-1)
  • Deployment: /v1/users endpoint, pointing to an EC2 instance.
  • Custom Domain Name: api.example.com (associated with the us-east-1 deployment)
  • ACM Certificate: arn:aws:acm:us-east-1:123456789012:certificate/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (for api.example.com in us-east-1)

Region 2: eu-west-1

  • API Gateway: api.example.com (created in eu-west-1)
  • Deployment: /v1/users endpoint, pointing to an EC2 instance.
  • Custom Domain Name: api.example.com (associated with the eu-west-1 deployment)
  • ACM Certificate: arn:aws:acm:eu-west-1:123456789012:certificate/yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy (for api.example.com in eu-west-1)

Global Traffic Management: Route 53

  • Hosted Zone: example.com
  • Record Set 1 (Primary):
    • Name: api.example.com
    • Type: A
    • Alias: Yes
    • Alias Target: d-xxxxxxxxxx.execute-api.us-east-1.amazonaws.com (API Gateway regional endpoint for us-east-1)
    • Evaluate Target Health: Yes
    • Health Check ID: hc-us-east-1-api (points to the health check for our us-east-1 API)
  • Record Set 2 (Secondary):
    • Name: api.example.com
    • Type: A
    • Alias: Yes
    • Alias Target: d-yyyyyyyyyy.execute-api.eu-west-1.amazonaws.com (API Gateway regional endpoint for eu-west-1)
    • Evaluate Target Health: Yes
    • Health Check ID: hc-eu-west-1-api (points to the health check for our eu-west-1 API)
  • Routing Policy: Failover
    • Primary Record: The us-east-1 record set.
    • Secondary Record: The eu-west-1 record set.
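
As a rough illustration, the two record sets above could be expressed as the change batch that boto3's Route 53 `change_resource_record_sets` call accepts. This is a minimal sketch that only builds the request data. The alias hosted zone IDs and the `SetIdentifier` values are hypothetical placeholders, not values from this setup; the real zone ID and domain name are reported by API Gateway for each custom domain name.

```python
from typing import Any, Dict


def failover_record(role: str, alias_dns: str, alias_zone_id: str,
                    health_check_id: str) -> Dict[str, Any]:
    """Build one failover alias record for api.example.com."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "api.example.com",
            "Type": "A",
            # Records sharing a name need distinct identifiers (hypothetical values).
            "SetIdentifier": f"api-{role.lower()}",
            "Failover": role,
            "AliasTarget": {
                # Zone of the API Gateway regional domain, not the example.com zone.
                "HostedZoneId": alias_zone_id,
                "DNSName": alias_dns,
                "EvaluateTargetHealth": True,
            },
            "HealthCheckId": health_check_id,
        },
    }


batch = {
    "Changes": [
        failover_record("PRIMARY",
                        "d-xxxxxxxxxx.execute-api.us-east-1.amazonaws.com",
                        "ZONE_ID_US_EAST_1",   # placeholder, look up the real value
                        "hc-us-east-1-api"),
        failover_record("SECONDARY",
                        "d-yyyyyyyyyy.execute-api.eu-west-1.amazonaws.com",
                        "ZONE_ID_EU_WEST_1",   # placeholder, look up the real value
                        "hc-eu-west-1-api"),
    ]
}

# To apply it (needs boto3 and AWS credentials, so it is left commented out):
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="<example.com hosted zone id>", ChangeBatch=batch)
```

Keeping the record construction in a plain function makes it easy to review the payload before anything is sent to AWS.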

How It Works

Route 53’s Failover routing policy does the work here. Health checks run continuously in the background, so when a client (like a browser or application) resolves api.example.com, Route 53 answers based on the most recent health status of the primary record (us-east-1).

  1. Primary Healthy: If the health check for us-east-1 passes, Route 53 returns the IP address for the us-east-1 API Gateway. All traffic goes to us-east-1.
  2. Primary Unhealthy, Secondary Healthy: If the health check for us-east-1 fails, Route 53 automatically stops returning its IP address. If the health check for the secondary record (eu-west-1) passes, Route 53 returns the IP address for the eu-west-1 API Gateway instead. New DNS lookups resolve to eu-west-1, while clients holding a cached answer keep reaching for us-east-1 until that answer's TTL expires.
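  3. Both Unhealthy: If both health checks fail, Route 53 fails open. With no healthy record to choose from, it answers with the primary record anyway, so clients still receive an address for us-east-1, but their requests will likely fail until one of the regions recovers.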
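
The decision Route 53 makes for a failover pair can be sketched as a small function. This is a simplified model for illustration, not Route 53's implementation; the function name is made up here. Note the last branch: Route 53's documented behavior when both records are unhealthy is to answer with the primary record rather than return nothing.

```python
def failover_answer(primary_healthy: bool, secondary_healthy: bool) -> str:
    """Return which region a failover record pair would resolve to."""
    if primary_healthy:
        # Case 1: the primary wins whenever it is healthy.
        return "us-east-1"
    if secondary_healthy:
        # Case 2: the primary is down, so the secondary takes over.
        return "eu-west-1"
    # Case 3: nothing is healthy, so Route 53 fails open to the primary.
    return "us-east-1"


for primary, secondary in [(True, True), (False, True), (False, False)]:
    print(primary, secondary, "->", failover_answer(primary, secondary))
```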

Health Checks

Crucially, you need robust health checks for each API Gateway deployment. These health checks should verify not just that the API Gateway is running, but that it can successfully reach its backend and return a valid response.
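
One way to picture such a "deep" health check is a handler that reports 200 only when every backend probe succeeds. The sketch below is framework-agnostic and entirely hypothetical: the `health_response` name and the probe names are invented for illustration, and you would wire the result into whatever serves your /health route.

```python
from typing import Callable, Dict, Tuple


def health_response(probes: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, str]]:
    """Run each backend probe and return (HTTP status, per-probe results)."""
    results: Dict[str, str] = {}
    for name, probe in probes.items():
        try:
            results[name] = "ok" if probe() else "fail"
        except Exception:
            # A probe that raises (timeout, refused connection) counts as a failure.
            results[name] = "fail"
    # 503 tells the Route 53 health checker that this region should not take traffic.
    status = 200 if all(value == "ok" for value in results.values()) else 503
    return status, results


def broken_probe() -> bool:
    raise ConnectionError("backend unreachable")


print(health_response({"backend": lambda: True}))
print(health_response({"backend": lambda: True, "database": broken_probe}))
```

Keep the probes cheap and give them short timeouts, since the checkers call this endpoint frequently.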

Example Health Check Configuration (for us-east-1):

  • Name: api-us-east-1-health
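  • What to Monitor: Endpoint, specified by domain name (Route 53 health checks have no API Gateway-specific resource type)
  • Domain Name: <api-id>.execute-api.us-east-1.amazonaws.com (the API's default endpoint; the d-xxxxxxxxxx regional domain name only serves requests whose Host header is api.example.com, so probing it directly is likely to return 403)
  • Request Path: /<stage>/health (assuming you have a /health endpoint on your API Gateway that returns a 200 OK; on the default endpoint the path includes the stage name)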
  • Port: 443
  • Protocol: HTTPS
  • Failure Threshold: 3 (number of consecutive failures before the endpoint is considered unhealthy)
  • Request Interval: 30 seconds

You would create an identical health check for the eu-west-1 deployment, adjusting the domain name and region.
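
For reference, here is roughly how that configuration maps onto the arguments of boto3's Route 53 `create_health_check` call. The sketch only builds the arguments; the hostname passed in is a hypothetical placeholder, and `health_check_request` is a helper name invented for this example.

```python
import uuid
from typing import Any, Dict


def health_check_request(fqdn: str, path: str = "/health") -> Dict[str, Any]:
    """Build keyword arguments for route53.create_health_check()."""
    return {
        # Route 53 requires a unique string per creation request.
        "CallerReference": str(uuid.uuid4()),
        "HealthCheckConfig": {
            "Type": "HTTPS",
            "FullyQualifiedDomainName": fqdn,
            "Port": 443,
            "ResourcePath": path,
            "RequestInterval": 30,   # seconds between probes from each checker
            "FailureThreshold": 3,   # consecutive failures before "unhealthy"
            "EnableSNI": True,       # send the hostname during the TLS handshake
        },
    }


# Hypothetical hostname; substitute the one your health check should probe.
request = health_check_request("abc123.execute-api.us-east-1.amazonaws.com")

# To create it (needs boto3 and AWS credentials, so it is left commented out):
#   boto3.client("route53").create_health_check(**request)
```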

DNS Propagation and TTL

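By default, Route 53 health checkers probe an endpoint every 30 seconds. With a failure threshold of 3, it takes roughly 90 seconds of failed probes before the endpoint is marked unhealthy. From that point Route 53 answers new queries with the secondary record; there is no separate propagation step, but resolvers and clients keep using the cached primary answer until its TTL expires. Note that you cannot set a TTL on an alias record that points to an AWS resource: Route 53 uses the target's default TTL. If you want to control the TTL yourself (e.g., 60 seconds), you would need non-alias records for api.example.com, and a lower TTL means more DNS queries and therefore higher query costs. To speed up detection instead, you can lower the failure threshold or use the 10-second request interval, which is billed at a higher rate.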
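
To get a feel for the numbers, failover time can be estimated as detection time plus the DNS TTL. This is a back-of-the-envelope sketch under simplifying assumptions: it ignores how Route 53 aggregates results from its many checkers and assumes clients honor the TTL. The 60-second TTL is just the example value mentioned above.

```python
def estimated_failover_seconds(request_interval: int, failure_threshold: int,
                               dns_ttl: int) -> int:
    """Rough time until clients stop using a failed region."""
    # Time for enough consecutive probes to fail.
    detection = request_interval * failure_threshold
    # Cached answers may keep pointing at the failed region for one full TTL.
    return detection + dns_ttl


print(estimated_failover_seconds(30, 3, 60))  # standard interval
print(estimated_failover_seconds(10, 3, 60))  # fast interval
```

With the standard 30-second interval the estimate is 150 seconds; switching to the 10-second interval brings it down to 90.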

The ACM certificates are managed independently per region. Each custom domain name should use the Regional endpoint type, and a Regional custom domain name must be associated with an ACM certificate from that same region. Route 53’s Alias records then point to the regional API Gateway domain names.

The most surprising thing about this setup is how little you actually change about your API Gateway deployments themselves – the heavy lifting for failover is entirely managed by Route 53’s intelligent health checking and DNS routing. You’re essentially running two independent, identical services and letting a global traffic manager decide which one is available.

Once your multi-region API Gateway failover is operational, the next challenge is ensuring your data is also replicated across regions for a truly disaster-proof application.
