Pub/Sub-triggered Cloud Functions don’t actually "retry" in the way you’re probably thinking; instead, Pub/Sub redelivers messages until they are acknowledged, and the subscription’s configuration (not the function’s code) dictates how often and for how long Pub/Sub keeps trying.

Let’s see this in action. Imagine you have a Pub/Sub topic named new-orders and a Cloud Function process-order triggered by it. When a message arrives on new-orders, Pub/Sub delivers it to process-order. You never send the acknowledgment (ACK) yourself: the platform ACKs on your behalf when the function returns successfully. If process-order raises an exception or times out, and the function was deployed with retry on failure enabled (the --retry flag in gcloud), no ACK is sent and Pub/Sub delivers the message again.

Here’s a simplified Python snippet for a Cloud Function that doesn’t ACK:

import base64
import json

def process_order(event, context):
    """Triggered by a Pub/Sub message."""
    message_data = base64.b64decode(event['data']).decode('utf-8')
    order_info = json.loads(message_data)

    print(f"Received order: {order_info['order_id']}")

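    # *** CRITICAL: raising an exception means no acknowledgment is sent. ***
    # With retry on failure enabled, Pub/Sub will redeliver the message.
    # For demonstration, this function fails on every attempt.
    # In a real function, you'd simply return once processing succeeds;
    # a normal return is what gets the message ACKed.
    raise RuntimeError(f"Simulated failure for order {order_info['order_id']}")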

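If this function is deployed with retries enabled, Pub/Sub will keep sending the same message over and over. The subscription’s settings control how this redelivery dance plays out: the retry policy sets the pacing between attempts, while the dead-letter policy (or, failing that, the message retention window) determines when Pub/Sub stops trying on this subscription.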

The core problem Pub/Sub’s retry policy solves is ensuring that messages are eventually processed even if your Cloud Function temporarily fails due to transient errors (network glitches, service outages, bugs in your code that are quickly fixed). It’s a mechanism for achieving at-least-once delivery semantics.
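
At-least-once delivery also means the same message can arrive more than once, so handlers are usually written to be idempotent. The following is a minimal sketch under stated assumptions: handle_once is a hypothetical helper, the ID passed in is taken to be the Pub/Sub message ID (which background functions expose as context.event_id), and the in-memory set stands in for a durable store. A real deployment would need storage shared across function instances.

```python
import base64
import json

# Hypothetical stand-in for a durable store (database, cache, ...).
_processed_ids = set()


def handle_once(event, message_id):
    """Process a Pub/Sub event at most once per message ID.

    Returns True if the order was processed, False if it was a duplicate.
    """
    if message_id in _processed_ids:
        return False  # Redelivery of a message that was already handled.
    order = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    # ... real processing of `order` would go here ...
    _processed_ids.add(message_id)
    return True
```

Inside process_order, this could be called as handle_once(event, context.event_id); a duplicate then returns normally and is ACKed without repeating the work.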

Internally, Pub/Sub can track the delivery attempts for each message on a subscription, and it does so when the subscription has a dead-letter policy. The first delivery counts as attempt 1. If your function processes the message and it is ACKed, the message is done. If it is not ACKed (the function raised an error, timed out, or the delivery was explicitly NACKed), Pub/Sub increments the delivery attempt count and schedules a redelivery. The dead-letter policy defines the upper limit for this count, and the retry policy defines how long Pub/Sub waits between redeliveries.
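
The counting described above can be pictured with a toy model. This is only an illustration in plain Python, not Pub/Sub’s implementation: simulate_delivery and flaky are hypothetical names, an exception stands in for "no ACK", and the "dead-lettered" outcome assumes a dead-letter topic is configured.

```python
def simulate_delivery(handler, max_delivery_attempts=5):
    """Toy model of redelivery: call handler until it succeeds or attempts run out.

    Returns (outcome, attempts), where outcome is "acked" or "dead-lettered".
    """
    for attempt in range(1, max_delivery_attempts + 1):
        try:
            handler(attempt)
        except Exception:
            continue  # No ACK: the attempt is counted and the message is redelivered.
        return "acked", attempt
    return "dead-lettered", max_delivery_attempts


def flaky(attempt):
    """Hypothetical handler that fails on its first two attempts."""
    if attempt < 3:
        raise RuntimeError("transient failure")
```

A handler that recovers after a transient error ends up ACKed on a later attempt, while one that always fails exhausts the limit.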

You configure this retry policy at the subscription level, not directly on the Cloud Function itself. When you create a Pub/Sub subscription (which is implicitly created when you set up a Pub/Sub-triggered Cloud Function in the console or via gcloud), you can specify these parameters.

The key parameters are:

  • Maximum delivery attempts: Part of the subscription’s dead-letter policy, so it only takes effect when a dead-letter topic is configured. It accepts values from 5 to 100. Once the limit is reached, Pub/Sub forwards the message to the dead-letter topic and stops redelivering it on this subscription. Without a dead-letter topic there is no attempt limit: redelivery continues until the message is ACKed or its retention period runs out.
  • Minimum and maximum retry delay: These make up the retry policy proper. They bound the exponential backoff Pub/Sub applies between redeliveries, and each can be set from 0 to 600 seconds.
  • Message retention duration: This is not part of the retry policy directly, but it’s crucial. It defines how long Pub/Sub keeps unacknowledged messages on the subscription before discarding them (7 days by default). If retention is too short, messages might be lost before they can be retried sufficiently.
  • Expiration policy: This controls when an inactive subscription itself is deleted. It has no effect on individual messages or on redelivery.

Let’s say you want a message to be attempted at most 5 times, and you want Pub/Sub to wait at least 10 seconds between retries. You’d configure both on the subscription: the first through the dead-letter policy, the second through the retry policy.

Using gcloud, you can set this when creating or updating a subscription:

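# Example: Creating a subscription with retry and dead-letter settings
# (new-orders-dead-letter is a placeholder; that topic must already exist)
gcloud pubsub subscriptions create my-function-subscription \
  --topic=new-orders \
  --ack-deadline=10 \
  --message-retention-duration=7d \
  --min-retry-delay=10s \
  --max-retry-delay=600s \
  --dead-letter-topic=new-orders-dead-letter \
  --max-delivery-attempts=5

# Example: Updating an existing subscription
gcloud pubsub subscriptions update my-function-subscription \
  --dead-letter-topic=new-orders-dead-letter \
  --max-delivery-attempts=5

In this example:

  • --ack-deadline=10: This is the time, in seconds, your function has to process and ACK a message per attempt. If it takes longer than 10 seconds, Pub/Sub will assume failure and redeliver. This is distinct from the delay between retries.
  • --min-retry-delay=10s and --max-retry-delay=600s: The retry policy. Pub/Sub waits at least 10 seconds before redelivering a failed message and backs off up to 600 seconds on repeated failures.
  • --dead-letter-topic and --max-delivery-attempts=5: The dead-letter policy. Pub/Sub will try delivering the message up to 5 times; after the 5th failed attempt (i.e., no ACK), it forwards the message to the dead-letter topic instead of redelivering it. The count is tracked on a best-effort basis, so treat 5 as approximate.
  • --message-retention-duration=7d: Unacknowledged messages are kept for up to 7 days. After that they are discarded, whether or not they were ever retried.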
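
Settings like these can also be written as the JSON body that the Pub/Sub REST API accepts for a subscription. The sketch below only builds and validates a dictionary and makes no network calls. build_subscription_body is a hypothetical helper, the project and topic names passed to it are placeholders, and the field names are assumed to match the REST Subscription resource.

```python
def build_subscription_body(project, topic, dead_letter_topic,
                            max_delivery_attempts=5, ack_deadline_seconds=10,
                            min_backoff_seconds=10, max_backoff_seconds=600):
    """Build a REST-style Subscription body; performs no network calls."""
    if not 5 <= max_delivery_attempts <= 100:
        raise ValueError("max_delivery_attempts must be between 5 and 100")
    if not 10 <= ack_deadline_seconds <= 600:
        raise ValueError("ack_deadline_seconds must be between 10 and 600")
    if not 0 <= min_backoff_seconds <= max_backoff_seconds <= 600:
        raise ValueError("backoff must satisfy 0 <= min <= max <= 600 seconds")
    return {
        "topic": f"projects/{project}/topics/{topic}",
        "ackDeadlineSeconds": ack_deadline_seconds,
        "messageRetentionDuration": "604800s",  # 7 days, expressed in seconds
        "retryPolicy": {
            "minimumBackoff": f"{min_backoff_seconds}s",
            "maximumBackoff": f"{max_backoff_seconds}s",
        },
        "deadLetterPolicy": {
            "deadLetterTopic": f"projects/{project}/topics/{dead_letter_topic}",
            "maxDeliveryAttempts": max_delivery_attempts,
        },
    }
```

Validating the ranges locally catches mistakes such as a delivery-attempt limit below 5 before any request is sent.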

The delay between retries is governed by the subscription’s retry policy. With a retry policy set, Pub/Sub uses exponential backoff: the delay between redeliveries starts at the minimum retry delay and grows with each failed attempt until it reaches the maximum retry delay (at most 600 seconds). You set the bounds with --min-retry-delay and --max-retry-delay; the exact growth curve in between is up to Pub/Sub. Without a retry policy, Pub/Sub may redeliver a failed message as soon as it can. Backoff prevents overwhelming your function with rapid-fire redeliveries immediately after a failure.
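
To get a feel for the pacing, here is a sketch of a backoff schedule clamped between a minimum and a maximum delay. Pub/Sub’s exact growth curve is not described in this article, so the doubling factor is purely an assumption, and backoff_schedule is a hypothetical helper.

```python
def backoff_schedule(attempts, min_delay=10.0, max_delay=600.0, factor=2.0):
    """Return the delay in seconds before each of `attempts` redeliveries."""
    delays = []
    delay = min_delay
    for _ in range(attempts):
        delays.append(min(delay, max_delay))  # Never exceed the configured cap.
        delay *= factor
    return delays
```

With these defaults the first four delays are 10, 20, 40 and 80 seconds, and from the seventh redelivery onward the delay stays at the 600-second cap.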

What most people miss is that the ack-deadline on the subscription is the per-attempt timeout. It can be set anywhere from 10 to 600 seconds. If your function needs more than, say, 10 seconds to process a single message reliably, you must increase this deadline. If your function takes minutes against a 10-second deadline, Pub/Sub treats each attempt as failed and redelivers the message while the earlier attempt is still running, which is usually not what you want. Pull subscribers can also extend the ack deadline programmatically when they know they need more time, but relying on that to make up for an undersized deadline is generally an anti-pattern; configure the subscription appropriately.
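
One way to size the deadline is to start from an observed processing time and add headroom. The sketch below does that and clamps the result to the 10 to 600 second range. recommended_ack_deadline is a hypothetical helper, and the 1.5x headroom factor is an arbitrary assumption rather than an official recommendation.

```python
import math


def recommended_ack_deadline(processing_seconds, headroom=1.5):
    """Suggest an ack deadline in seconds, clamped to the 10-600 range.

    Returns (deadline, fits); fits is False when even 600 seconds is too short.
    """
    needed = math.ceil(processing_seconds * headroom)
    deadline = max(10, min(600, needed))
    return deadline, needed <= 600
```

When fits comes back False, the work does not fit in a single attempt at all, and splitting it into smaller messages is likely a better fix than a longer deadline.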

The next thing you’ll grapple with is handling messages that still fail after all retry attempts, which is where dead-letter topics become essential.
