By default, a CDN doesn’t load balance across origins in the traditional sense; it sends each cache miss to the origin it can reach fastest, not necessarily the least loaded one.

Let’s see this in action. Imagine you have two origin servers, origin1.example.com and origin2.example.com, both serving the same content. A user in Tokyo requests www.example.com/image.jpg.

# User's DNS resolver queries for www.example.com
# DNS resolver asks CDN's authoritative DNS servers
# CDN's DNS servers respond with the IP address of a CDN edge server *closest to Tokyo*

# User's browser connects to the Tokyo CDN edge server

# Has the CDN edge server cached image.jpg?
# If YES: Serve from cache. Done.
# If NO: The CDN edge server needs to fetch it from an origin.
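
The hit-or-miss decision above can be sketched in a few lines of Python. This is a toy model, not any CDN's actual implementation: the EdgeCache class, the fake_origin function, and the in-memory dictionary are all hypothetical stand-ins, and real edges also handle TTLs, revalidation, and eviction.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy edge server: serve from cache on a hit, fetch from an origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._store: Dict[str, bytes] = {}
        self._fetch_from_origin = fetch_from_origin
        self.origin_fetches = 0  # counts how often the origin was contacted

    def get(self, path: str) -> bytes:
        # Cache hit: answer locally, the origin is never involved.
        if path in self._store:
            return self._store[path]
        # Cache miss: this is the only point where origin selection matters.
        body = self._fetch_from_origin(path)
        self.origin_fetches += 1
        self._store[path] = body
        return body


def fake_origin(path: str) -> bytes:
    # Hypothetical origin that just echoes the requested path.
    return f"content of {path}".encode()


edge = EdgeCache(fake_origin)
first = edge.get("/image.jpg")   # miss: goes to the origin
second = edge.get("/image.jpg")  # hit: served from the edge
```

Only the first request reaches the origin, so the origin-selection logic discussed next applies to cache misses only.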

Now, the CDN edge server needs to decide which origin to ask. This is where the "load balancing" myth comes in. The CDN’s primary directive is latency. It will pick the origin that appears to be fastest from its own perspective. This is usually determined by:

  1. Geographic Proximity: The origin server physically located closest to the CDN edge server.
  2. Network Latency: The measured round-trip time (RTT) between the CDN edge server and each origin.

So, if origin1.example.com is in San Francisco and origin2.example.com is in New York, and the CDN edge server is in Tokyo:

  • The CDN edge server will measure latency to both origin1.example.com and origin2.example.com.
  • It will likely see lower RTT to origin1.example.com (San Francisco) than to origin2.example.com (New York), because traffic from Tokyo typically reaches the US West Coast first and must then cross the continent to reach New York.
  • Therefore, the CDN will send the request to origin1.example.com.

This is not load balancing. If origin1.example.com is overloaded and origin2.example.com is idle, the CDN will still send the request to origin1.example.com because it’s perceived as faster.
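
To make the difference concrete, here is a minimal sketch comparing a latency-first policy with a least-loaded policy. The origin names, RTT values, and connection counts are made-up illustrations, not measurements, and the two functions are hypothetical helpers rather than any CDN's API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Origin:
    name: str
    rtt_ms: float            # round-trip time as seen from the edge (illustrative)
    active_connections: int  # current load on the origin (illustrative)


def pick_lowest_latency(origins: List[Origin]) -> Origin:
    # What a latency-driven edge does: load is not part of the decision.
    return min(origins, key=lambda o: o.rtt_ms)


def pick_least_loaded(origins: List[Origin]) -> Origin:
    # What a classic load balancer might do: latency is not part of the decision.
    return min(origins, key=lambda o: o.active_connections)


origins = [
    Origin("near-origin", rtt_ms=110.0, active_connections=950),
    Origin("far-origin", rtt_ms=170.0, active_connections=12),
]

latency_choice = pick_lowest_latency(origins)  # near-origin, despite its load
load_choice = pick_least_loaded(origins)       # far-origin, despite its RTT
```

With these numbers the two policies disagree: the latency-first policy keeps choosing the busy nearby origin, while the least-loaded policy would move traffic to the idle one.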

How CDNs actually handle multiple origins:

Most CDNs offer a feature called "Origin Load Balancing" or "Origin Failover." This isn’t about distributing load based on server capacity, but rather ensuring availability.

When you configure multiple origins, you typically provide a list of hostnames or IP addresses. The CDN then uses a combination of health checks and latency measurements to select an origin.

  • Health Checks: The CDN periodically sends a small request (e.g., an HTTP HEAD request to /healthcheck) to each origin. If an origin fails to respond within a certain timeout (e.g., 5 seconds) or returns an error status code (e.g., 5xx), it’s marked as unhealthy.
  • Origin Selection: When an edge server needs to fetch content:
    • It first considers only the healthy origins.
    • Among the healthy origins, it selects the one with the lowest measured latency from its location.
    • If all healthy origins are experiencing high latency, it might still pick the "least worst" one.
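
The health-check and selection steps above might look roughly like the following sketch. It assumes the 5-second timeout and 5xx rule from the example, and it reuses the probe's response time as the latency measurement, which is a simplification. ProbeResult, is_healthy, and select_origin are hypothetical names, and the probe values are invented.

```python
from dataclasses import dataclass
from typing import List, Optional

TIMEOUT_S = 5.0  # assumed health-check timeout, taken from the example above


@dataclass
class ProbeResult:
    origin: str
    status: Optional[int]  # HTTP status code, or None if there was no response
    elapsed_s: float       # how long the probe took


def is_healthy(probe: ProbeResult) -> bool:
    # Unhealthy if the origin did not answer in time or returned a 5xx status.
    if probe.status is None or probe.elapsed_s > TIMEOUT_S:
        return False
    return not (500 <= probe.status <= 599)


def select_origin(probes: List[ProbeResult]) -> Optional[str]:
    # Step 1: consider only healthy origins.
    healthy = [p for p in probes if is_healthy(p)]
    if not healthy:
        return None
    # Step 2: pick the lowest latency, even if every option is slow ("least worst").
    return min(healthy, key=lambda p: p.elapsed_s).origin


probes = [
    ProbeResult("origin1.example.com", status=503, elapsed_s=0.08),  # fast but failing
    ProbeResult("origin2.example.com", status=200, elapsed_s=0.21),
    ProbeResult("origin3.example.com", status=None, elapsed_s=5.0),  # no response
]
chosen = select_origin(probes)
```

Here origin1 answers fastest but is excluded for returning a 5xx, so the selection falls to origin2. Nothing in the logic looks at how busy the chosen origin is.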

Configuring Multiple Origins (Conceptual Example - Akamai):

In a CDN like Akamai, you’d define a "Primary" and "Secondary" origin server, or a list of origins with weights.

  • Primary Origin: origin1.example.com (e.g., IP 192.0.2.10)
  • Secondary Origin: origin2.example.com (e.g., IP 192.0.2.11)

The CDN will always try to use origin1.example.com first, as long as it passes health checks. If origin1.example.com becomes unresponsive, the CDN will automatically switch to origin2.example.com. This is failover, not load balancing.
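
As a rough illustration of that primary/secondary behavior, the sketch below walks an ordered origin list and returns the first healthy entry. The function name and the health map are hypothetical; this is not Akamai's configuration syntax or API.

```python
from typing import Dict, List, Optional


def choose_with_failover(ordered_origins: List[str],
                         healthy: Dict[str, bool]) -> Optional[str]:
    # Walk the list in priority order; the first healthy origin gets all traffic.
    for origin in ordered_origins:
        if healthy.get(origin, False):
            return origin
    return None  # every origin is down


priority = ["origin1.example.com", "origin2.example.com"]

normal = choose_with_failover(
    priority, {"origin1.example.com": True, "origin2.example.com": True})
failed_over = choose_with_failover(
    priority, {"origin1.example.com": False, "origin2.example.com": True})
```

While the primary is healthy, the secondary receives no traffic at all, which is why this counts as failover rather than load distribution.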

What if you really want load balancing?

If you need true load balancing based on server load, you need to implement it behind the CDN, in front of your origin servers. This typically involves:

  1. A Dedicated Load Balancer: Place a hardware or software load balancer (like HAProxy, Nginx, or a cloud provider’s LB service) in front of your origin servers.
  2. CDN Points to the Load Balancer: Configure your CDN to point to the IP address or hostname of this load balancer as its single origin.
  3. Load Balancer Distributes: The load balancer then distributes traffic across your multiple origin servers based on algorithms like least connections, round robin, or weighted round robin.

Example Configuration Snippet (Nginx as Origin Load Balancer):

http {
    upstream origin_pool {
        # Least connections: sends each request to the server with the fewest active connections, taking weights into account
        least_conn;
        server origin1.example.com:8080 weight=10; # Receives about twice the share of origin2 (10:5)
        server origin2.example.com:8080 weight=5;  # Receives about half the share of origin1
        server origin3.example.com:8080 backup;   # Only used if origin1 & origin2 are unavailable
    }

    server {
        listen 80;
        server_name origin-lb.example.com; # This is what the CDN points to

        location / {
            proxy_pass http://origin_pool;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            # ... other proxy headers
        }
    }
}

In this setup, the CDN edge server requests origin-lb.example.com. Nginx, acting as the load balancer, then intelligently forwards that request to one of origin1, origin2, or origin3 based on its configuration.
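
For intuition about what a weighted least-connections policy does, here is a simplified Python model. It is not Nginx's actual implementation: the Upstream class and pick function are hypothetical, and the rule used (lowest ratio of active connections to weight, with backups considered only when no primary is up) is an approximation of the behavior described in the config comments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Upstream:
    name: str
    weight: int = 1
    backup: bool = False
    up: bool = True
    active: int = 0  # currently open connections


def pick(servers: List[Upstream]) -> Optional[Upstream]:
    # Prefer primaries that are up; fall back to backups only if none are.
    primaries = [s for s in servers if s.up and not s.backup]
    candidates = primaries or [s for s in servers if s.up and s.backup]
    if not candidates:
        return None
    # Weighted least connections: lowest active/weight ratio wins.
    return min(candidates, key=lambda s: s.active / s.weight)


pool = [
    Upstream("origin1", weight=10),
    Upstream("origin2", weight=5),
    Upstream("origin3", backup=True),
]

# Open 15 long-lived connections without closing any of them.
for _ in range(15):
    server = pick(pool)
    server.active += 1

# Simulate both primaries failing; the backup should now be chosen.
pool[0].up = False
pool[1].up = False
fallback = pick(pool)
```

In this model origin1 ends up holding about twice as many connections as origin2, matching the 10:5 weights, and origin3 is used only once both primaries are marked down.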

The most surprising thing about CDN origin selection is that it doesn’t inherently understand server load; its primary goal is to minimize latency from the edge’s perspective. This means a closer but overloaded origin will be favored over a farther, idle one for as long as the overloaded origin keeps passing health checks and still measures as faster.

The next concept you’ll likely encounter is managing cache invalidation across multiple origins when the content is dynamic or changes frequently.
