Fan-out patterns, where a single event triggers multiple parallel executions of a Cloud Function, are a powerful way to scale asynchronous processing, but they often mask a hidden bottleneck: the sequential nature of their triggering.

Let’s see this in action. Imagine we have a bucket of images in Google Cloud Storage, and we want to generate a thumbnail for each. A common approach is to trigger a Cloud Function on object creation.

# Example Cloud Storage trigger configuration
cloud_functions:
  - name: generate-thumbnail
    runtime: python39
    entry_point: handler
    event_trigger:
      event_type: google.storage.object.finalize
      resource: projects/_/buckets/my-image-bucket

When a new image lands in my-image-bucket, the generate-thumbnail function is invoked. If we have 100 images uploaded simultaneously, Cloud Functions will try to spin up 100 separate invocations. This is the "fan-out" part.
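For orientation, here is a minimal sketch of what the handler entry point could look like. It assumes a 1st gen Python background function, whose event dictionary for a storage trigger carries the bucket and name of the new object. The thumbnails/ prefix and the thumbnail_target helper are hypothetical, and the actual image resizing is left out.

```python
import posixpath


def thumbnail_target(event, prefix="thumbnails/"):
    """Return the object name a thumbnail would be written to, or None to skip.

    `prefix` is a hypothetical naming convention, not something Cloud Storage defines.
    """
    name = event["name"]
    # Skip objects that are already thumbnails; otherwise writing the thumbnail
    # back to the same bucket would trigger this function again.
    if name.startswith(prefix):
        return None
    stem, ext = posixpath.splitext(posixpath.basename(name))
    return f"{prefix}{stem}_thumb{ext}"


def handler(event, context):
    """Entry point: one invocation per finalized object."""
    target = thumbnail_target(event)
    if target is None:
        return
    # Download, resize, and upload would go here (omitted in this sketch).
    print(f"Would write gs://{event['bucket']}/{target}")
```

The prefix check matters only if thumbnails are written back to the bucket that triggers the function; writing them to a separate bucket avoids the question entirely.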

However, what if the work doesn’t arrive as independent uploads, but comes from a batch job that iterates over its inputs and calls the Cloud Function itself?

Consider a scenario where a batch process needs to process 1000 records. A naive implementation might look like this:

# Example of a sequential fan-out (BAD!)
import json

from google.cloud import functions_v1

client = functions_v1.CloudFunctionsServiceClient()
project_id = "my-gcp-project"
location = "us-central1"
function_name = "process-record"
function_path = f"projects/{project_id}/locations/{location}/functions/{function_name}"

for record_id in range(1000):
    payload = {"record_id": record_id}
    # call_function blocks until the function returns, so calls run one at a time.
    # The data field is a string, so the payload is serialized to JSON first.
    response = client.call_function(name=function_path, data=json.dumps(payload))
    print(f"Processed record {record_id} with result: {response.result}")

In this code, each client.call_function call completes before the next one starts. Even though the process-record function itself might be designed to run in parallel, the triggering mechanism is a sequential bottleneck. If each function call takes 5 seconds, processing 1000 records this way takes over an hour (1000 * 5s = 5000s ≈ 1.4 hours). On top of that, the call_function API is intended for testing and allows only very limited traffic, so it is a poor fit for production fan-out in any case.

The problem we’re solving is how to efficiently trigger multiple instances of a Cloud Function in parallel, especially when the source of the work is a large dataset or a batch job. The goal is to maximize the throughput of our parallel processing.

The solution relies on the fact that Cloud Functions can be invoked over HTTP or by events such as Pub/Sub messages. With a Pub/Sub trigger, the fan-out is handled by Pub/Sub itself: it delivers messages to the function concurrently, and Cloud Functions starts additional instances to handle them.

To achieve true parallel triggering, we need to change how the batch job initiates the work. Instead of calling the function sequentially, we’ll push messages to a Pub/Sub topic, and have the Cloud Function subscribe to that topic.

Here’s how to restructure the batch job:

First, create a Pub/Sub topic (e.g., process-records-topic).

gcloud pubsub topics create process-records-topic --project=my-gcp-project

Then, configure your Cloud Function to be triggered by this topic.

# Example Cloud Functions configuration with Pub/Sub trigger
cloud_functions:
  - name: process-record
    runtime: python39
    entry_point: handler
    event_trigger:
      event_type: google.pubsub.topic.publish
      resource: projects/my-gcp-project/topics/process-records-topic
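On the function side, a minimal sketch of the handler entry point might look like the following. It assumes a 1st gen Python background function, where the message body arrives base64-encoded under the event's "data" key, and it assumes the publisher sends a UTF-8 JSON object such as {"record_id": 42}. The decode_record helper is a hypothetical name used for illustration.

```python
import base64
import json


def decode_record(event):
    """Extract the record payload from a Pub/Sub-triggered event.

    Assumes the publisher sent a UTF-8 JSON object such as {"record_id": 42}.
    """
    raw = base64.b64decode(event["data"]).decode("utf-8")
    return json.loads(raw)


def handler(event, context):
    """Entry point: one invocation per delivered message."""
    record = decode_record(event)
    # Real processing would go here; this sketch only logs the ID.
    print(f"Processing record {record['record_id']}")
    return record["record_id"]
```

The handler can be exercised locally by building an event dictionary by hand, for example {"data": base64.b64encode(b'{"record_id": 42}')}, and passing None for the context.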

Now, modify the batch job to publish messages to the topic:

# Example of a parallel fan-out using Pub/Sub (GOOD!)
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
project_id = "my-gcp-project"
topic_name = "process-records-topic"
topic_path = publisher.topic_path(project_id, topic_name)

publish_futures = []
for record_id in range(1000):
    payload = {"record_id": record_id}
    # Serialize as JSON so the function can parse the message body reliably.
    data = json.dumps(payload).encode("utf-8")
    # publish() returns immediately; the client library batches messages in the background.
    publish_futures.append(publisher.publish(topic_path, data))

# Wait for Pub/Sub to accept every message; result() raises if a publish failed.
for future in publish_futures:
    future.result()

print("All messages published to Pub/Sub.")

When the batch job runs, it rapidly publishes 1000 messages to process-records-topic. Pub/Sub then takes over, distributing these messages to available process-record Cloud Function instances. Cloud Functions, seeing multiple messages arriving on its Pub/Sub subscription, will scale up automatically to handle the load concurrently. This dramatically reduces the overall processing time. Instead of 1000 * 5s = 5000s, if Cloud Functions can spin up 100 instances in parallel, and each message takes 5s, the total time becomes closer to (1000 messages / 100 instances) * 5s = 50s.
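The estimate above can be written as a small helper. This is an idealized model, assuming messages are processed in equal "waves" across a fixed number of instances and ignoring cold starts, delivery latency, and retries; the estimated_seconds name is hypothetical.

```python
import math


def estimated_seconds(n_messages, seconds_per_message, parallel_instances):
    """Idealized wall-clock time when work proceeds in waves of `parallel_instances`."""
    waves = math.ceil(n_messages / parallel_instances)
    return waves * seconds_per_message


print(estimated_seconds(1000, 5, 1))    # sequential triggering: 5000
print(estimated_seconds(1000, 5, 100))  # 100 parallel instances: 50
```

Real runs will land above the idealized figure, but the model is enough to see how the instance count dominates total time.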

The key levers you control are:

  • Pub/Sub Publisher Configuration: The number of subscribers to a topic doesn’t directly limit publishing speed. What matters is the throughput of your publisher: each publish() call takes a single message, and the client library groups messages into batches in the background according to its batch settings.
  • Cloud Function Trigger Configuration: Ensuring the event_type is google.pubsub.topic.publish and the resource points to your topic is fundamental.
  • Cloud Function Concurrency Settings: While Pub/Sub handles the distribution of messages, the max_instances setting on your Cloud Function dictates how many parallel invocations can run simultaneously. This should be tuned based on your function’s resource needs and GCP quotas.
  • Message Payload Size: Larger payloads can impact network transfer times and memory usage within the function. Keep them as small as practical.
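As one way to act on the payload-size lever, the batch job could encode each payload compactly and refuse anything over a self-imposed budget. The sketch below assumes a 64 KiB budget, which is an arbitrary illustrative value and not a Pub/Sub limit; encode_payload is a hypothetical helper.

```python
import json

# Assumption for this sketch: a self-imposed budget, not a limit set by Pub/Sub.
MAX_PAYLOAD_BYTES = 64 * 1024


def encode_payload(payload, max_bytes=MAX_PAYLOAD_BYTES):
    """Serialize a payload to compact JSON bytes, rejecting oversized messages."""
    data = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(data) > max_bytes:
        raise ValueError(f"payload is {len(data)} bytes, budget is {max_bytes}")
    return data


print(encode_payload({"record_id": 7}))  # b'{"record_id":7}'
```

When a record is large, a common alternative is to publish only its identifier and let the function fetch the full record from storage.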

The trickiest part of Pub/Sub-triggered fan-out is often dealing with duplicate deliveries and message ordering. Pub/Sub guarantees at-least-once delivery by default, meaning a message might be delivered more than once, and it does not guarantee delivery order by default. If your function is not idempotent (meaning running it multiple times with the same input has the same effect as running it once), you’ll need to implement deduplication logic within your process-record function itself, typically by tracking processed message IDs or unique record identifiers in a database.
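The deduplication idea can be sketched as follows. The InMemorySeenStore class is a hypothetical stand-in for a database table keyed by message ID; it only illustrates the check-then-process flow. Function instances do not share memory, so a real deployment would need a shared store with an atomic "insert if absent" operation.

```python
class InMemorySeenStore:
    """Stand-in for a database table keyed by message ID (hypothetical)."""

    def __init__(self):
        self._seen = set()

    def mark_if_new(self, message_id):
        """Return True the first time an ID is seen, False on repeats."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True


store = InMemorySeenStore()
processed = []  # records that were actually handled, for demonstration


def process_once(message_id, record_id):
    """Process a record only if its message ID has not been seen before."""
    if not store.mark_if_new(message_id):
        return False  # duplicate delivery: skip the side effects
    processed.append(record_id)
    return True


process_once("msg-1", 42)
process_once("msg-1", 42)  # redelivery of the same message
print(processed)  # [42]
```

Note that marking a message before processing it means a crash in between would drop the record; recording the ID in the same transaction as the work itself is the safer design where the database allows it.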

The next hurdle you’ll likely encounter is managing the cost implications of massively scaled Cloud Functions and the potential for hitting concurrency quotas.
