A single API Gateway deployment is a single point of failure for your entire application.
Here’s how we’ll set up a multi-region API Gateway failover, making your API resilient to regional outages. We’ll use AWS Route 53 and API Gateway in two regions.
The Setup
Imagine we have two identical API Gateway deployments, one in us-east-1 (N. Virginia) and another in eu-west-1 (Ireland). Both are serving the exact same API.
Region 1: us-east-1
- API Gateway: api.example.com (created in us-east-1)
- Deployment: /v1/users endpoint, pointing to an EC2 instance.
- Custom Domain Name: api.example.com (associated with the us-east-1 deployment)
- ACM Certificate: arn:aws:acm:us-east-1:123456789012:certificate/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (for api.example.com in us-east-1)
Region 2: eu-west-1
- API Gateway: api.example.com (created in eu-west-1)
- Deployment: /v1/users endpoint, pointing to an EC2 instance.
- Custom Domain Name: api.example.com (associated with the eu-west-1 deployment)
- ACM Certificate: arn:aws:acm:eu-west-1:123456789012:certificate/yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy (for api.example.com in eu-west-1)
Global Traffic Management: Route 53
- Hosted Zone: example.com
- Record Set 1 (Primary):
  - Name: api.example.com
  - Type: A
  - Alias: Yes
  - Alias Target: d-xxxxxxxxxx.execute-api.us-east-1.amazonaws.com (API Gateway regional endpoint for us-east-1)
  - Evaluate Target Health: Yes
  - Health Check ID: hc-us-east-1-api (points to the health check for our us-east-1 API)
- Record Set 2 (Secondary):
  - Name: api.example.com
  - Type: A
  - Alias: Yes
  - Alias Target: d-yyyyyyyyyy.execute-api.eu-west-1.amazonaws.com (API Gateway regional endpoint for eu-west-1)
  - Evaluate Target Health: Yes
  - Health Check ID: hc-eu-west-1-api (points to the health check for our eu-west-1 API)
- Routing Policy: Failover
  - Primary Record: The us-east-1 record set.
  - Secondary Record: The eu-west-1 record set.
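The two record sets above can be sketched as a Route 53 change batch. This is a minimal illustration, not a definitive implementation: the helper name `failover_alias_record`, the set identifiers, and the alias hosted zone IDs (`ZXXXXXXXXXXXXX`, `ZYYYYYYYYYYYYY`) are placeholders introduced for this example. In practice the alias hosted zone ID is the regional hosted zone ID that API Gateway reports for each custom domain name, and real health check IDs are generated by Route 53.

```python
import json


def failover_alias_record(name, role, set_id, alias_dns, alias_zone_id, health_check_id):
    """Build one failover alias A record change in the shape Route 53 expects.

    All argument values used below are placeholders for illustration.
    """
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            # Failover records sharing a name need distinct set identifiers.
            "SetIdentifier": set_id,
            "Failover": role,
            # Alias records carry no TTL field; Route 53 uses the target's TTL.
            "AliasTarget": {
                "HostedZoneId": alias_zone_id,
                "DNSName": alias_dns,
                "EvaluateTargetHealth": True,
            },
            "HealthCheckId": health_check_id,
        },
    }


change_batch = {
    "Comment": "Failover records for api.example.com",
    "Changes": [
        failover_alias_record(
            "api.example.com", "PRIMARY", "us-east-1-primary",
            "d-xxxxxxxxxx.execute-api.us-east-1.amazonaws.com",
            "ZXXXXXXXXXXXXX",  # placeholder alias hosted zone ID
            "hc-us-east-1-api",
        ),
        failover_alias_record(
            "api.example.com", "SECONDARY", "eu-west-1-secondary",
            "d-yyyyyyyyyy.execute-api.eu-west-1.amazonaws.com",
            "ZYYYYYYYYYYYYY",  # placeholder alias hosted zone ID
            "hc-eu-west-1-api",
        ),
    ],
}

print(json.dumps(change_batch, indent=2))
```

With boto3 installed and credentials configured, a batch like this would be passed as the `ChangeBatch` argument of the Route 53 client's `change_resource_record_sets` call, alongside the `HostedZoneId` of the example.com zone.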
How It Works
Route 53’s Failover routing policy is the magic here. When a client (like a browser or application) makes a request to api.example.com, Route 53 checks the health of the primary record (us-east-1).
- Primary Healthy: If the health check for us-east-1 passes, Route 53 returns the IP addresses for the us-east-1 API Gateway. All traffic goes to us-east-1.
- Primary Unhealthy, Secondary Healthy: If the health check for us-east-1 fails, Route 53 stops returning its IP addresses and checks the health of the secondary record (eu-west-1). If that health check passes, Route 53 returns the IP addresses for the eu-west-1 API Gateway. New DNS lookups resolve to eu-west-1, while clients holding a cached answer keep using us-east-1 until that answer expires.
- Both Unhealthy: If both health checks fail, Route 53 does not return an empty answer. It falls back to treating the records as healthy and returns the primary record (us-east-1), so api.example.com still resolves, but requests will likely fail until one region recovers.
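The decision Route 53 makes for each query can be modeled in a few lines. This is a simplified sketch of the failover policy, not Route 53's actual implementation, and `resolve_region` is a name made up for this example. It assumes Route 53's documented fallback for active-passive failover: when both records are unhealthy, the primary record is returned rather than an empty answer.

```python
PRIMARY = "us-east-1"
SECONDARY = "eu-west-1"


def resolve_region(primary_healthy: bool, secondary_healthy: bool) -> str:
    """Return the region whose record a failover policy would answer with."""
    if primary_healthy:
        return PRIMARY
    if secondary_healthy:
        return SECONDARY
    # Both unhealthy: fall back to the primary record instead of
    # returning no answer at all (assumed fail-open behavior).
    return PRIMARY


for primary_ok in (True, False):
    for secondary_ok in (True, False):
        region = resolve_region(primary_ok, secondary_ok)
        print(f"primary_ok={primary_ok} secondary_ok={secondary_ok} -> {region}")
```

The secondary's health only matters once the primary has failed, which is why this arrangement is called active-passive.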
Health Checks
Crucially, you need robust health checks for each API Gateway deployment. These health checks should verify not just that the API Gateway is running, but that it can successfully reach its backend and return a valid response.
Example Health Check Configuration (for us-east-1):
- Name: api-us-east-1-health
- Monitored Resource: Endpoint (the us-east-1 API Gateway, specified by domain name)
- Domain Name: d-xxxxxxxxxx.execute-api.us-east-1.amazonaws.com
- Request Path: /health (assuming you have a /health endpoint on your API Gateway that returns a 200 OK)
- Port: 443
- Protocol: HTTPS
- Failure Threshold: 3 (number of consecutive failures before the endpoint is considered unhealthy)
- Request Interval: 30 seconds
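You would create an identical health check for the eu-west-1 deployment, adjusting the domain name and region. Because API Gateway routes requests by hostname, confirm that the health checker actually receives a 200 from the domain you configure. If it does not, pointing the check at the API's default execute-api endpoint (including the stage in the path) is a common alternative.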
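The health check settings listed above map onto a Route 53 health check request roughly as follows. This is a sketch under stated assumptions: `https_health_check_request` is a helper invented for this example, and the domain names are the placeholders used throughout this article. The validation reflects Route 53's limits of a 10- or 30-second request interval and a failure threshold between 1 and 10.

```python
import uuid


def https_health_check_request(domain, path="/health", interval=30, threshold=3):
    """Build the arguments for creating an HTTPS health check.

    The returned dict only mirrors the request shape; nothing is sent to AWS.
    """
    if interval not in (10, 30):
        raise ValueError("Route 53 supports request intervals of 10 or 30 seconds")
    if not 1 <= threshold <= 10:
        raise ValueError("failure threshold must be between 1 and 10")
    return {
        # A unique string that lets Route 53 detect retried requests.
        "CallerReference": str(uuid.uuid4()),
        "HealthCheckConfig": {
            "Type": "HTTPS",
            "FullyQualifiedDomainName": domain,
            "Port": 443,
            "ResourcePath": path,
            "RequestInterval": interval,
            "FailureThreshold": threshold,
            "EnableSNI": True,
        },
    }


us_check = https_health_check_request(
    "d-xxxxxxxxxx.execute-api.us-east-1.amazonaws.com"  # placeholder domain
)
eu_check = https_health_check_request(
    "d-yyyyyyyyyy.execute-api.eu-west-1.amazonaws.com"  # placeholder domain
)
print(us_check["HealthCheckConfig"])
print(eu_check["HealthCheckConfig"])
```

With boto3, each dict could be unpacked into the Route 53 client's `create_health_check` call. The ID in the response is what the failover records reference as their health check ID.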
DNS Propagation and TTL
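Route 53 health checkers probe each endpoint every 30 seconds by default (a faster 10-second interval is available at extra cost). With a failure threshold of 3, it takes roughly 90 seconds of consecutive failures before the primary is marked unhealthy and Route 53 starts answering with the secondary. After that, clients pick up the change only when their cached answer expires, so the record TTL adds to the total failover time. For non-alias records, a low TTL (e.g., 60 seconds) keeps that window short, at the cost of more DNS queries. Alias records, as used here, do not let you set a TTL: Route 53 uses the TTL of the alias target, which is typically 60 seconds for AWS resources.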
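A rough upper bound on how long clients keep hitting a failed region follows from these numbers. The function below is a back-of-the-envelope estimate rather than a guarantee: it ignores the time Route 53 takes to aggregate results from its checkers and assumes resolvers honor the TTL. The name `worst_case_failover_seconds` is made up for this example.

```python
def worst_case_failover_seconds(request_interval: int, failure_threshold: int, ttl: int) -> int:
    """Estimate the time from outage start until cached DNS answers expire."""
    # Time for enough consecutive failed probes to mark the endpoint unhealthy.
    detection = request_interval * failure_threshold
    # A client may have cached the old answer just before the record flipped.
    return detection + ttl


# Standard 30-second checks, threshold 3, 60-second TTL.
print(worst_case_failover_seconds(30, 3, 60))  # 150

# Fast 10-second checks with the same threshold and TTL.
print(worst_case_failover_seconds(10, 3, 60))  # 90
```

Detection time dominates at the default settings, so shortening the request interval or lowering the failure threshold does more for failover speed than trimming the TTL.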
The ACM certificates are managed independently per region. When you create a custom domain name in API Gateway, you associate it with an ACM certificate from that same region. Route 53’s Alias records then point to the regional API Gateway endpoints.
The most surprising thing about this setup is how little you actually change about your API Gateway deployments themselves – the heavy lifting for failover is entirely managed by Route 53’s intelligent health checking and DNS routing. You’re essentially running two independent, identical services and letting a global traffic manager decide which one is available.
Once your multi-region API Gateway failover is operational, the next challenge is ensuring your data is also replicated across regions for a truly disaster-proof application.