Caching is often seen as a black box, but understanding how your CDN interprets Cache-Control headers can unlock significantly higher hit rates, meaning more of your requests are served directly from the CDN’s edge, not from your origin.
Let’s see this in action. Imagine you have a static asset, say an image, served from your web server.
GET /images/logo.png HTTP/1.1
Host: example.com
User-Agent: curl/7.64.1
Accept: */*
And your origin server responds with:
HTTP/1.1 200 OK
Content-Type: image/png
Last-Modified: Tue, 15 Nov 2022 10:00:00 GMT
ETag: "12345-abcdef"
Cache-Control: public, max-age=3600
Content-Length: 12345
When a user’s browser requests this asset, the CDN will likely cache it. The public directive means it can be cached by any cache, including intermediate proxies and CDNs. max-age=3600 tells the cache (including the CDN) that this resource is fresh for 3600 seconds (1 hour) from the time it was generated.
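To make the header above concrete, here is a minimal sketch of how a cache might split a Cache-Control value into individual directives. The function name parse_cache_control is hypothetical, and the comma split is deliberately naive: it does not handle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into a {directive: argument} mapping.

    Directives without an argument (e.g. "public") map to None.
    Naive sketch: quoted arguments containing commas are not supported.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directive names are case-insensitive, so normalize to lowercase.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


directives = parse_cache_control("public, max-age=3600")
print(directives)  # {'public': None, 'max-age': '3600'}
```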
Now, if another user, or even the same user from a different device, requests /images/logo.png within that hour, the CDN will serve it directly from its cache.
GET /images/logo.png HTTP/1.1
Host: cdn.example.com
User-Agent: curl/7.64.1
Accept: */*
The CDN’s response will look something like this, indicating it’s serving from its cache:
HTTP/1.1 200 OK
Content-Type: image/png
Content-Length: 12345
Date: Tue, 15 Nov 2022 10:15:00 GMT
Age: 900
Cache-Control: public, max-age=3600
X-Cache: HIT from cdn.example.com
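The Age header shows how long (in seconds) the response has been held in cache since it was fetched from, or last validated with, the origin. Here, Age: 900 means the CDN’s copy is 15 minutes old, so 2700 seconds of its max-age remain. A HIT in X-Cache (or a similar CDN-specific header) confirms it was served from the cache.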
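As a rough illustration of how Age and max-age relate, the sketch below computes how many seconds a cached response has left before it goes stale. The function name remaining_freshness and the plain-dict header representation are assumptions made for this example; real caches also consider s-maxage, Expires, and heuristic freshness.

```python
import re


def remaining_freshness(headers: dict) -> int:
    """Seconds left before the response becomes stale (0 if already stale).

    Sketch only: looks at max-age and Age, nothing else.
    """
    match = re.search(
        r"(?:^|[,\s])max-age=(\d+)",
        headers.get("Cache-Control", ""),
        re.IGNORECASE,
    )
    max_age = int(match.group(1)) if match else 0
    age = int(headers.get("Age", "0"))
    return max(max_age - age, 0)


cdn_response = {"Cache-Control": "public, max-age=3600", "Age": "900"}
print(remaining_freshness(cdn_response))  # 2700
```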
The problem most developers face is that they either don’t set Cache-Control headers at all, or they set lifetimes that are too long or too short, leading to stale content in the first case and unnecessary origin requests in the second.
The Core Problem: Cache Invalidation vs. Cache Freshness
The fundamental tension in caching is between serving stale data and making expensive origin requests. Cache-Control headers are your primary tool to tell caches (including CDNs) how to manage this balance. The goal is to maximize the time resources can be served from the cache (high hit rate) without serving outdated content.
Key Directives and Their Impact on CDNs
public vs. private: public allows any cache (browser, proxy, CDN) to store the response. private restricts caching to the end-user’s browser. For CDNs, you almost always want public. A private directive will prevent the CDN from caching the asset, effectively negating its purpose for that resource.
max-age=<seconds>: This is the most crucial directive. It tells the cache how long a resource is considered "fresh." For static assets that change infrequently (images, CSS, JS files), setting a long max-age (e.g., max-age=31536000 for a year) is ideal for hit rates. For dynamic content that changes frequently, max-age=0 or max-age=60 is more appropriate.
s-maxage=<seconds>: This directive is specifically for shared caches like CDNs and proxies. It overrides max-age for these caches. If you want your CDN to cache for a year but browsers only for an hour, you’d use s-maxage=31536000, max-age=3600. This is powerful for optimizing CDN usage.
no-cache: This doesn’t mean "don’t cache." It means "cache, but always revalidate with the origin before serving." The cache will store the response, but on each subsequent request, it will send a conditional request (e.g., If-None-Match or If-Modified-Since) to the origin. If the origin says 304 Not Modified, the cache serves the stored copy. This is a good compromise for assets that change somewhat frequently but where you still want the benefit of a quick check.
no-store: This means "do not cache at all." The response is sent directly to the client, and no cache (browser, CDN, proxy) is allowed to store it. Use this only for highly sensitive or rapidly changing data where caching is explicitly undesirable.
must-revalidate / proxy-revalidate: These tell the cache that once a resource becomes stale (its max-age expires), it must revalidate with the origin. It cannot serve a stale copy even if the origin is unavailable. proxy-revalidate applies the same rule to shared caches only. This is important for ensuring data freshness when strictness is required.
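The way these directives interact can be sketched as a small decision function. This is a simplified model, not how any particular CDN is implemented: the name cache_decision is hypothetical, the header is assumed to be well formed, and a response with no explicit lifetime is treated as having zero freshness (real caches may apply heuristics instead).

```python
from typing import Dict, Optional, Tuple


def cache_decision(header: str, shared: bool) -> Tuple[bool, int]:
    """Return (may_store, fresh_seconds) for a Cache-Control value.

    shared=True models a CDN or proxy, shared=False models a browser.
    fresh_seconds == 0 means the stored copy must be revalidated before reuse.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        name, sep, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value if sep else None

    if "no-store" in directives:
        return (False, 0)  # nothing may be stored anywhere
    if shared and "private" in directives:
        return (False, 0)  # shared caches must not store private responses
    if "no-cache" in directives:
        return (True, 0)  # stored, but revalidated on every request
    if shared and "s-maxage" in directives:
        return (True, int(directives["s-maxage"]))  # overrides max-age for shared caches
    if "max-age" in directives:
        return (True, int(directives["max-age"]))
    return (True, 0)  # assumption: no explicit lifetime means revalidate


header = "s-maxage=31536000, max-age=3600"
print(cache_decision(header, shared=True))   # (True, 31536000)
print(cache_decision(header, shared=False))  # (True, 3600)
```

Running the same header through both modes shows the split described above: the CDN keeps the object for a year while browsers recheck after an hour.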
Configuration Example: Optimal Static Asset Caching
For static assets (e.g., /static/*, *.js, *.css, *.jpg, *.png) that are versioned or fingerprinted (meaning their filename changes when content changes, like app.a1b2c3d4.js), you can set extremely long cache times.
Your origin server’s HTTP headers (or your CDN’s rules if it supports header manipulation) might look like this:
Cache-Control: public, max-age=31536000, immutable
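public: Allows CDN caching.
max-age=31536000: Cache for one year.
immutable: A newer directive indicating the response will not change while it is fresh. Caches that support it skip revalidation during that period (for example, on a browser reload), further improving performance. Support varies, and caches that do not recognize the directive simply ignore it.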
For assets that are not fingerprinted but still static (e.g., robots.txt, favicon.ico), you might use:
Cache-Control: public, max-age=86400
max-age=86400: Cache for one day. This is a reasonable balance for assets that might occasionally be updated but don’t require immediate reflection.
For dynamic content or HTML pages that should be fresh on every request (or revalidated):
Cache-Control: no-cache
This tells the CDN to cache the response but always check with the origin first using If-None-Match or If-Modified-Since. If the origin returns 304 Not Modified, the CDN serves its cached copy, saving bandwidth and origin load.
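The three policies in this section could be selected per request path along these lines. This is a sketch under stated assumptions: the function name cache_control_for, the fingerprint pattern (a dot-delimited run of at least eight hex characters before the extension), and the extension lists are all illustrative choices, not a standard.

```python
import re

# Hypothetical fingerprint convention: name.<8+ hex chars>.ext, e.g. app.a1b2c3d4.js
FINGERPRINTED = re.compile(r"\.[0-9a-f]{8,}\.(?:js|css|png|jpg|jpeg|gif|svg|woff2)$")

# Illustrative list of extensions treated as static but not fingerprinted.
STATIC_EXTENSIONS = (".js", ".css", ".png", ".jpg", ".ico", ".txt")


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value for a request path."""
    if FINGERPRINTED.search(path):
        # Content changes imply a new filename, so cache for a year.
        return "public, max-age=31536000, immutable"
    if path.endswith(STATIC_EXTENSIONS):
        # Static but not versioned: allow updates to show up within a day.
        return "public, max-age=86400"
    # HTML and dynamic responses: store, but revalidate on every request.
    return "no-cache"


for path in ("/static/app.a1b2c3d4.js", "/favicon.ico", "/index.html"):
    print(path, "->", cache_control_for(path))
```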
The Invisible ETag and Last-Modified Dance
When you use no-cache or when a max-age expires, the CDN needs to ask your origin if the resource has changed. It does this using ETag and Last-Modified headers.
ETag: An opaque identifier for a specific version of a resource.
Last-Modified: The date and time the resource was last changed.
If your origin server doesn’t send these headers, the CDN will have to perform a full GET request every time it needs to revalidate, even if the file hasn’t changed. This is inefficient. Ensuring your origin server reliably sends ETag and Last-Modified headers is crucial for effective cache revalidation.
For example, when a max-age expires, the CDN might send:
GET /images/logo.png HTTP/1.1
Host: origin.example.com
If-Modified-Since: Tue, 15 Nov 2022 10:00:00 GMT
If-None-Match: "12345-abcdef"
If the file hasn’t changed, your origin responds with:
HTTP/1.1 304 Not Modified
Date: Tue, 15 Nov 2022 11:00:00 GMT
ETag: "12345-abcdef"
This 304 response tells the CDN its cached copy is still valid. The Age header on the CDN’s response to the end-user will then be reset or updated based on this revalidation.
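On the origin side, the decision between 304 and a full 200 can be sketched as follows. The function name is_not_modified and the dict-based request headers are assumptions for illustration; most web servers and frameworks already implement this for you. The sketch gives If-None-Match precedence over If-Modified-Since, compares ETags weakly, and splits the ETag list on commas, which is a simplification.

```python
from email.utils import parsedate_to_datetime


def _strip_weak(tag: str) -> str:
    """Drop the weak-validator prefix so W/"x" and "x" compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers: dict, etag: str, last_modified: str) -> bool:
    """Return True if the origin may answer 304 Not Modified."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When If-None-Match is present, If-Modified-Since is ignored.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates:
            return True
        return _strip_weak(etag) in [_strip_weak(tag) for tag in candidates]

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            return parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: fall back to a full response

    return False  # unconditional request: send the full body


request = {
    "If-Modified-Since": "Tue, 15 Nov 2022 10:00:00 GMT",
    "If-None-Match": '"12345-abcdef"',
}
print(is_not_modified(request, '"12345-abcdef"', "Tue, 15 Nov 2022 10:00:00 GMT"))  # True
```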
The "Cache Busting" Fallacy and Why Versioning is Key
Many developers resort to "cache busting" techniques like appending a query parameter (/script.js?v=123) when they don’t want to deal with Cache-Control. This is often a mistake, because the result depends on how each cache builds its cache key. Some CDNs and proxies are configured to ignore query parameters, in which case /script.js?v=123 and /script.js?v=124 map to the same cached object and the new version never reaches users. Others cache each unique URL as a separate resource, which does bust the cache, but it also means any stray or reordered parameter creates another copy and lowers your hit rate. The more reliable approach is to rename the file itself (e.g., script.123.js) and set a long max-age.
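One common way to produce such filenames is to derive the version from a hash of the file’s content, so the name changes exactly when the content does. The sketch below is one possible approach, not a prescribed one: the function name fingerprinted_name, the choice of SHA-256, and the 8-character digest length are all assumptions. Build tools typically do this step for you.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename: str, content: bytes, digest_length: int = 8) -> str:
    """Insert a short content hash before the file extension.

    "script.js" becomes "script.<hash>.js"; the hash depends only on content.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old = fingerprinted_name("static/script.js", b"console.log('v1');")
new = fingerprinted_name("static/script.js", b"console.log('v2');")
print(old)
print(new)
```

Because the two versions get different URLs, each can be served with a long max-age without any risk of users receiving the old file under the new name.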
The next thing you’ll need to understand is how to purge specific assets from the CDN’s cache when an urgent update is required.