The fundamental difference between CDN origin pull and origin push isn’t about where the content lives, but who decides when it gets there.

Let’s see it in action. Imagine a new blog post is published on your-awesome-blog.com.

With Origin Pull, a user in London requests that post. Their browser hits the nearest CDN edge server. If the CDN doesn’t have a cached copy of that blog post, it pulls the content directly from your origin server (your-awesome-blog.com) and then serves it to the London user. Crucially, the CDN only fetches the content when it’s actually requested by an end-user.

# Simulate a cache miss on an edge server (conceptually)
# The edge server's internal cache (e.g., in memory or on disk)
# doesn't contain the requested asset: /blog/new-post.html

# The edge server then makes an HTTP GET request to your origin
# (This is the 'pull' action)
GET /blog/new-post.html HTTP/1.1
Host: your-awesome-blog.com
User-Agent: CDN-Edge-Server/1.0

# Your origin server responds with the content
HTTP/1.1 200 OK
Content-Type: text/html
Cache-Control: public, max-age=3600
Content-Length: 12345
... (HTML content of the blog post) ...

# The edge server caches this response for future requests
# and serves it to the London user.

This is the default for most CDNs. It’s simple: the CDN acts as an on-demand caching proxy in front of your origin.
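The pull flow above can be sketched as a toy pull-through cache. This is a minimal illustration, not any CDN's actual implementation: the `PullEdge` class and the `origin` function are hypothetical names, and the "origin" is a plain Python function standing in for an HTTP request to your-awesome-blog.com.

```python
from typing import Callable, Dict


class PullEdge:
    """Toy edge server that contacts the origin only on a cache miss."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self.fetch_from_origin = fetch_from_origin
        self.cache: Dict[str, str] = {}
        self.origin_requests = 0  # how many times we had to pull

    def get(self, path: str) -> str:
        if path not in self.cache:
            # Cache miss: this is the "pull" from the origin.
            self.origin_requests += 1
            self.cache[path] = self.fetch_from_origin(path)
        # Cache hit (or freshly filled): serve from the edge.
        return self.cache[path]


def origin(path: str) -> str:
    # Hypothetical origin: returns a fake body for any path.
    return f"<html>content of {path}</html>"


edge = PullEdge(origin)
first = edge.get("/blog/new-post.html")   # miss: pulled from the origin
second = edge.get("/blog/new-post.html")  # hit: served from the edge cache
print(edge.origin_requests)  # prints 1
```

The second request never reaches the origin, which is the offloading effect a pull CDN provides once an asset is warm.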

Now, consider Origin Push. Here, you explicitly send content to the CDN before users request it. Think of it like pre-stocking your shelves at a distribution center. You upload your new-post.html directly to a CDN storage bucket. The CDN then makes this content available from its edge locations.

# You upload the content to a CDN storage location (e.g., an S3 bucket)
# using a CLI tool or API.
aws s3 cp /path/to/your-awesome-blog.com/blog/new-post.html \
  s3://your-cdn-storage-bucket/blog/new-post.html \
  --acl public-read

# The CDN's infrastructure then propagates this object to its edge servers.
# When a user in London requests it:
# The edge server retrieves the *already cached* asset from its local storage.
# It does NOT need to contact your origin server.
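For contrast, here is a minimal sketch of the push flow, assuming a simplified model in which an upload is copied to every edge immediately. The `PushCDN` and `PushEdge` classes are hypothetical. Real CDNs propagate objects in their own ways, and this sketch does not model propagation delay.

```python
from typing import Dict, List, Optional


class PushEdge:
    """Toy edge that only serves what has been pushed to it."""

    def __init__(self) -> None:
        self.storage: Dict[str, str] = {}

    def receive(self, path: str, body: str) -> None:
        self.storage[path] = body

    def get(self, path: str) -> Optional[str]:
        # No origin to fall back on: a missing object is simply missing.
        return self.storage.get(path)


class PushCDN:
    """Toy CDN storage that copies each upload to all of its edges."""

    def __init__(self, edge_names: List[str]) -> None:
        self.edges: Dict[str, PushEdge] = {name: PushEdge() for name in edge_names}

    def upload(self, path: str, body: str) -> None:
        # You initiate the transfer; the CDN distributes the object.
        for edge in self.edges.values():
            edge.receive(path, body)


cdn = PushCDN(["london", "tokyo"])
before = cdn.edges["london"].get("/blog/new-post.html")  # nothing pushed yet
cdn.upload("/blog/new-post.html", "<html>new post</html>")
after = cdn.edges["london"].get("/blog/new-post.html")   # served from the edge
```

Note the trade-off this makes visible: before the upload, the London edge has nothing to serve and no origin to ask, so keeping the CDN in sync is entirely your responsibility.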

Both models solve the same underlying problem: efficiently distributing static assets (images, CSS, JavaScript, videos) across a global network of servers. Instead of every user hitting your single origin server, they hit the closest CDN edge, which reduces latency and offloads traffic from your infrastructure. Origin Pull does this with almost no setup, because the CDN fills its cache as requests arrive. Origin Push is typically used when you have large, infrequently changing files that you want present on the CDN before any user requests them, or when your origin server is not directly accessible from the public internet.

The core operational concern with Origin Pull is cache freshness. When the content on your origin server changes, edge servers keep serving their old copy until it expires via its TTL (Time To Live, set in the example above by Cache-Control: max-age=3600) or until you explicitly purge it. For Origin Push, the concern is synchronization and versioning. You’re responsible for uploading new versions and replacing or removing old ones, and the CDN distributes whatever you upload.
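A minimal sketch of the two Origin Pull freshness controls, TTL expiration and explicit purging, might look like this. The `TtlEdge` class is hypothetical, the clock is injected so the example runs without waiting, and expiry is modeled as a full refetch rather than a conditional revalidation request.

```python
from typing import Callable, Dict, Tuple


class TtlEdge:
    """Toy pull edge whose cached entries expire after a fixed TTL."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float]) -> None:
        self.fetch = fetch
        self.ttl = ttl_seconds
        self.clock = clock
        self.cache: Dict[str, Tuple[str, float]] = {}  # path -> (body, expires_at)

    def get(self, path: str) -> str:
        now = self.clock()
        entry = self.cache.get(path)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: serve the cached copy
        body = self.fetch(path)  # missing or expired: pull again
        self.cache[path] = (body, now + self.ttl)
        return body

    def purge(self, path: str) -> None:
        # Explicit invalidation: forget the copy regardless of TTL.
        self.cache.pop(path, None)


path = "/blog/new-post.html"
origin_content = {path: "v1"}  # hypothetical origin state
now = [0.0]                    # fake clock, in seconds

ttl_edge = TtlEdge(lambda p: origin_content[p], ttl_seconds=3600,
                   clock=lambda: now[0])
a = ttl_edge.get(path)         # pulled from origin: "v1"
origin_content[path] = "v2"
b = ttl_edge.get(path)         # TTL not expired: stale "v1" is served
ttl_edge.purge(path)
c = ttl_edge.get(path)         # purged, so pulled again: "v2"
origin_content[path] = "v3"
now[0] = 3601.0
d = ttl_edge.get(path)         # TTL expired, so pulled again: "v3"
```

The stale read in `b` is the behavior to plan around: without a purge, an edge can keep serving the old version for up to the full TTL after the origin changes.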

The one thing most people don’t realize is that "origin push" isn’t a single, universally implemented feature. Some CDNs offer it as a distinct service where you upload directly to their storage. Others achieve a similar effect by pairing a pull-based CDN with their own global object storage, so the "origin" you upload to is itself part of the provider’s infrastructure. Don’t confuse either with "origin shielding" (where a group of edge servers pulls from a single, central CDN point rather than directly from your origin): shielding reduces load on your origin, but it is still a pull, because the CDN initiates the fetch. The key to push is that you are initiating the transfer to the CDN’s infrastructure, rather than the CDN initiating the pull from your origin.

The next logical step is understanding how to manage cache invalidation effectively in an Origin Pull model.
