Cloud Run’s Pub/Sub push subscriptions are a powerful way to build event-driven architectures, but understanding how they really work is key to avoiding subtle pitfalls. The most surprising thing about them is how little Cloud Run itself does: it never polls Pub/Sub for new work. Pub/Sub sends each message to your service as an HTTPS POST request, and the HTTP status code you return is the acknowledgement.

Let’s see this in action. Imagine a simple Python Flask app deployed to Cloud Run, configured to receive Pub/Sub messages.

import base64
import json
import os

from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['POST'])
def index():
    envelope = request.get_json()
    if not envelope:
        msg = "no Pub/Sub message received"
        print(f"error: {msg}")
        return f"Bad Request: {msg}", 400

    pubsub_message = envelope.get('message')

    if not pubsub_message:
        msg = "no Pub/Sub message property in envelope"
        print(f"error: {msg}")
        return f"Bad Request: {msg}", 400

    if pubsub_message.get('data'):
        data = base64.b64decode(pubsub_message['data']).decode('utf-8')
        print(f"Received message: {data}")
        # Process your data here
    else:
        print("No data in message")

    return '', 204 # Acknowledge the message

if __name__ == '__main__':
    port = int(os.environ.get('PORT', 8080))
    app.run(host='0.0.0.0', port=port)

When you create a Pub/Sub push subscription pointing to this Cloud Run service, Pub/Sub delivers each message as a separate HTTPS POST request to the service’s URL and waits for the response. A delivered message stays outstanding until your service responds or the subscription’s acknowledgement deadline passes. Cloud Run, for its part, autoscales on incoming requests. When there are none (no Pub/Sub messages to process), it can scale down to zero instances. When Pub/Sub has messages, its request to your service’s URL acts as the trigger, causing Cloud Run to start an instance (if none is running) to handle it.
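
To make that request concrete, here is a minimal sketch of the JSON body a push request carries and how the handler above unpacks it. The envelope shape (a `message` object with base64-encoded `data`, plus a `subscription` name) follows the push format the Flask code already expects; the helper names, message ID, project, and subscription are hypothetical placeholders.

```python
import base64
import json


def build_push_envelope(payload: str, message_id: str, subscription: str) -> dict:
    """Build a dict shaped like the JSON body of a Pub/Sub push request."""
    return {
        "message": {
            # Pub/Sub base64-encodes the message payload in the "data" field.
            "data": base64.b64encode(payload.encode("utf-8")).decode("ascii"),
            "messageId": message_id,
            "attributes": {},
        },
        "subscription": subscription,
    }


def decode_push_envelope(envelope: dict) -> str:
    """Return the decoded payload, or an empty string if there is no data."""
    message = envelope.get("message") or {}
    data = message.get("data")
    if not data:
        return ""
    return base64.b64decode(data).decode("utf-8")


# Placeholder values, for illustration only.
envelope = build_push_envelope(
    '{"order_id": 42}',
    "1234567890",
    "projects/example-project/subscriptions/example-sub",
)
body = json.dumps(envelope)            # what would travel over HTTP
decoded = decode_push_envelope(json.loads(body))
print(decoded)
```

Serializing the same dict with `json.dumps` and POSTing it to a locally running copy of the service is a simple way to exercise the handler without a real subscription.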

The fundamental problem Pub/Sub push subscriptions solve is bridging the gap between a reliable, durable message queue (Pub/Sub) and a scalable, ephemeral compute platform (Cloud Run). Pub/Sub guarantees at-least-once delivery, but a push subscription needs an HTTP endpoint to deliver to. Cloud Run provides that endpoint, but it only runs your code in response to requests. The push subscription connects the two: Pub/Sub attempts to deliver messages to the specified URL, and each of those HTTP requests is what causes Cloud Run to start an instance if necessary.

The core components to understand are:

  • Pub/Sub Topic: Where messages are published.
  • Pub/Sub Subscription: A named resource attached to a topic, representing a stream of messages to be delivered. For push subscriptions, this is where you configure the endpoint.
  • Cloud Run Service: Your deployed containerized application, which will act as the push endpoint.
  • Push Endpoint URL: The URL of your Cloud Run service, configured in the subscription.
  • Message Acknowledgement: There is no separate acknowledge call. After processing a message, your Cloud Run service responds with a success status code; Pub/Sub treats 102, 200, 201, 202, and 204 as an acknowledgement and removes the message from the subscription’s backlog. Any other status code, or no response before the acknowledgement deadline, counts as a failure, and Pub/Sub will attempt to redeliver the message.
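
The acknowledgement rule can be sketched as a small lookup. This is an illustration of how Pub/Sub interprets the response, not code you deploy; the status code set reflects the Pub/Sub push documentation as I understand it, and the function name is hypothetical.

```python
# Status codes that Pub/Sub push delivery treats as an acknowledgement
# (per the Pub/Sub push documentation, as understood here).
ACK_STATUS_CODES = frozenset({102, 200, 201, 202, 204})


def push_outcome(status_code: int) -> str:
    """Return "ack" if the response acknowledges the message, else "retry"."""
    return "ack" if status_code in ACK_STATUS_CODES else "retry"


for code in (204, 200, 400, 500):
    print(code, push_outcome(code))
```

One consequence worth noticing: the example handler returns 400 for a malformed envelope, which falls in the retry group, so Pub/Sub would keep redelivering that message. If a message can never be processed, returning a success code after logging it (or relying on a dead-letter topic) is the usual way to stop the loop.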

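The configuration in the Pub/Sub subscription is critical. When creating or updating a subscription, you’ll set the "Push endpoint" to your Cloud Run service’s URL. You also configure a "Push authentication" method. For Cloud Run, this typically means having Pub/Sub attach an OIDC token for a service account you choose to each request. That service account needs permission to invoke the service (the Cloud Run Invoker role), which lets Cloud Run verify the caller and lets you keep the service closed to unauthenticated traffic.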
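
As a rough sketch of what those settings amount to, the function below assembles a request body in the shape of the Pub/Sub REST API’s Subscription resource (`pushConfig.pushEndpoint`, `pushConfig.oidcToken`, `ackDeadlineSeconds`). It only builds and prints JSON; it does not call any API. The project, topic, URL, and service account are made-up placeholders, and the function name is hypothetical.

```python
import json


def push_subscription_body(
    topic: str,
    endpoint: str,
    service_account_email: str,
    ack_deadline_seconds: int = 10,
) -> dict:
    """Assemble a Subscription resource body for a push subscription."""
    return {
        "topic": topic,
        "pushConfig": {
            # The Cloud Run service URL that Pub/Sub will POST to.
            "pushEndpoint": endpoint,
            # Ask Pub/Sub to attach an OIDC token for this service account.
            "oidcToken": {
                "serviceAccountEmail": service_account_email,
                # Assumption: use the endpoint URL as the token audience.
                "audience": endpoint,
            },
        },
        "ackDeadlineSeconds": ack_deadline_seconds,
    }


# Placeholder values, for illustration only.
subscription_body = push_subscription_body(
    topic="projects/example-project/topics/example-topic",
    endpoint="https://example-service-abc123-uc.a.run.app/",
    service_account_email="pubsub-invoker@example-project.iam.gserviceaccount.com",
    ack_deadline_seconds=60,
)
print(json.dumps(subscription_body, indent=2))
```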

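A common point of confusion arises from the fact that Cloud Run can scale down to zero. If your subscription has messages waiting while the service is scaled to zero, the first push request triggers an instance to start, and Cloud Run holds that request until the instance is ready. This "cold start" latency is inherent to scale-to-zero platforms. If the request fails or times out anyway, Pub/Sub retries, so messages aren’t lost just because an instance wasn’t immediately available. The retry policy on the subscription controls how quickly Pub/Sub retries a failed delivery (immediately, or with exponential backoff between a minimum and a maximum delay). How long it keeps retrying is bounded by the subscription’s message retention duration or, if you configure a dead-letter topic, by its maximum number of delivery attempts.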
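
To get a feel for what an exponential-backoff retry policy implies for redelivery timing, here is an illustrative calculation. Pub/Sub does not promise this exact curve; simple doubling from the minimum delay, capped at the maximum, is an assumption made for the sketch, and the 10-second and 600-second bounds are example values.

```python
def backoff_schedule(minimum_backoff: float, maximum_backoff: float, attempts: int) -> list:
    """Return illustrative delays (seconds) before each retry attempt.

    Assumes the delay doubles after every failure and never exceeds
    maximum_backoff. The real service may add jitter or use another curve.
    """
    delays = []
    delay = minimum_backoff
    for _ in range(attempts):
        delays.append(min(delay, maximum_backoff))
        delay *= 2
    return delays


schedule = backoff_schedule(minimum_backoff=10, maximum_backoff=600, attempts=8)
print(schedule)       # delays between successive delivery attempts
print(sum(schedule))  # rough total time spent across these retries
```

The point of the exercise is that a handler failing on every attempt is retried ever more slowly, so a persistent bug shows up as a growing backlog rather than a flood of requests.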

What most people don’t realize is how Pub/Sub’s acknowledgement deadline interacts with Cloud Run’s scaling. Pub/Sub doesn’t just send a message and forget it. Each delivery is effectively a temporary "lease" on the message. If your Cloud Run service responds with a success code within the acknowledgement deadline (10 seconds by default, configurable up to 600 seconds, i.e. 10 minutes), Pub/Sub considers the message delivered. If the deadline expires first, Pub/Sub assumes the delivery failed and will send the message again, possibly to a different instance or at a later time. This is why it’s crucial for your Cloud Run service to respond promptly. If your processing takes longer than the deadline, you’ll need to increase the acknowledgement deadline in the subscription settings.
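
A small sketch of how one might pick that setting: take the slowest processing time you expect, add headroom, and clamp the result to the range Pub/Sub accepts for the acknowledgement deadline (10 to 600 seconds). The helper names and the 1.5x headroom factor are hypothetical choices for illustration, not a recommendation from the Pub/Sub documentation.

```python
import math

MIN_ACK_DEADLINE_SECONDS = 10    # lowest value the subscription accepts
MAX_ACK_DEADLINE_SECONDS = 600   # highest value the subscription accepts


def choose_ack_deadline(worst_case_processing_seconds: float, headroom: float = 1.5) -> int:
    """Suggest an acknowledgement deadline with headroom, clamped to the valid range."""
    wanted = math.ceil(worst_case_processing_seconds * headroom)
    return max(MIN_ACK_DEADLINE_SECONDS, min(MAX_ACK_DEADLINE_SECONDS, wanted))


def responds_in_time(processing_seconds: float, ack_deadline_seconds: int) -> bool:
    """True if the handler would respond before the deadline expires."""
    return processing_seconds < ack_deadline_seconds


print(choose_ack_deadline(2))      # short handlers stay at the minimum
print(choose_ack_deadline(40))     # slower handlers get a longer deadline
print(choose_ack_deadline(1000))   # anything beyond the cap is clamped
print(responds_in_time(45, 10))    # a 45 s handler misses a 10 s deadline
```

When the clamp is hit, as in the 1000-second case, a longer deadline is no longer an option, and the work has to be shortened or handed off to something that runs outside the push request.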

The next hurdle you’ll likely encounter is managing message ordering and idempotency in a distributed, event-driven system where retries are common.
