Serverless functions, at their core, are event-driven automatons, but the systems that trigger them are often more complex than they appear.
Let’s see this in action. Imagine a new file lands in an S3 bucket. We want to process that file.
AWS Lambda with S3 Trigger
Here’s a simplified view of the setup in AWS, with the function definition and its S3 trigger shown together:
{
  "FunctionName": "my-s3-processor",
  "Runtime": "python3.9",
  "Role": "arn:aws:iam::123456789012:role/lambda_execution_role",
  "Handler": "index.handler",
  "Code": {
    "S3Bucket": "my-code-bucket",
    "S3Key": "my_function.zip"
  },
  "Events": [
    {
      "S3": {
        "Bucket": "my-data-bucket",
        "Events": [
          "s3:ObjectCreated:*"
        ]
      }
    }
  ]
}
When my-data-bucket receives a new object, S3 publishes an event notification; Lambda does not poll the bucket. If the event matches s3:ObjectCreated:*, S3 invokes my-s3-processor asynchronously and passes the event details as a JSON payload. The index.handler function in my_function.zip then executes.
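To make the payload concrete, here is a minimal sketch of what index.handler might look like. The Records / s3 / bucket / object layout follows the S3 event notification format; the sample event and the returned summary are illustrative assumptions, not part of the original setup.

```python
import urllib.parse


def handler(event, context):
    """Entry point matching the "index.handler" setting above (a sketch)."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (a space becomes "+"), so decode them.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        processed.append(
            {"bucket": bucket, "key": key, "event": record["eventName"]}
        )
    return processed


# Hypothetical, trimmed-down event for local experimentation.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-data-bucket"},
                "object": {"key": "reports/2024+q1.csv", "size": 1024},
            },
        }
    ]
}

result = handler(sample_event, None)
```

A single invocation can carry more than one record, which is why the handler loops rather than reading only the first entry.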
Google Cloud Functions with Cloud Storage Trigger
In GCP, it looks like this:
name: projects/my-gcp-project/functions/my-storage-processor
runtime: nodejs16
entryPoint: processFile
sourceArchiveBucket: gs://my-source-bucket
sourceArchiveObject: source.zip
eventTrigger:
  eventType: google.storage.object.finalize
  resource: gs://my-data-bucket
Here, Cloud Functions subscribes to Cloud Storage events. When an object is finalized (created or overwritten) in my-data-bucket, a message is published to a Pub/Sub topic (managed internally by GCP). Cloud Functions then picks up this message and triggers processFile with the event data.
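The configuration above targets Node.js, but for consistency with the other examples here is the same idea sketched in Python, assuming the background-function signature (data, context) used with google.storage.object.finalize triggers. The name process_file, the sample payload, and the returned URI are assumptions for illustration; the platform ignores a background function's return value.

```python
from types import SimpleNamespace


def process_file(data, context):
    """Handle a Cloud Storage finalize event (Python rendering of processFile)."""
    bucket = data["bucket"]
    name = data["name"]
    # event_id is useful for logging and for deduplicating retried deliveries.
    event_id = getattr(context, "event_id", "unknown")
    print(f"event {event_id}: gs://{bucket}/{name} finalized")
    # Returned only so the sketch is easy to test locally.
    return f"gs://{bucket}/{name}"


# Hypothetical payload and context for a local dry run.
sample_data = {"bucket": "my-data-bucket", "name": "uploads/photo.png"}
sample_context = SimpleNamespace(
    event_id="1234567890", event_type="google.storage.object.finalize"
)

uri = process_file(sample_data, sample_context)
```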
Azure Functions with Blob Storage Trigger
Azure has a similar pattern:
{
  "scriptFile": "run.py",
  "entryPoint": "main",
  "bindings": [
    {
      "name": "myblob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "my-data-container/{name}",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
This function.json defines a blob trigger. When a new blob is created or updated in my-data-container, the Azure Functions runtime automatically binds the blob’s content to the myblob parameter in the main function of run.py. The {name} in the path acts as a variable to capture the blob’s name.
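A minimal sketch of what run.py could contain follows. In a deployed function the myblob argument would be an azure.functions.InputStream; the sketch relies only on its name attribute and read() method so that it can be exercised locally without the Azure SDK. The describe_blob helper and the FakeBlob stand-in are hypothetical.

```python
import hashlib
import logging


def describe_blob(name: str, content: bytes) -> dict:
    """Summarise a blob: its name, size in bytes, and SHA-256 digest."""
    return {
        "name": name,
        "size": len(content),
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def main(myblob) -> None:
    # "myblob" must match the binding name in function.json.
    # Assumption: the object exposes .name and .read(), as InputStream does.
    info = describe_blob(myblob.name, myblob.read())
    logging.info("Processed blob %s (%d bytes)", info["name"], info["size"])


class FakeBlob:
    """Hypothetical local stand-in for the bound blob."""

    def __init__(self, name: str, content: bytes) -> None:
        self.name = name
        self._content = content

    def read(self) -> bytes:
        return self._content


main(FakeBlob("my-data-container/hello.txt", b"hello"))
```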
The core problem these services solve is decoupling compute from storage or messaging systems. Instead of your own service constantly polling for changes, the platform detects the change (through a notification from the source, or through polling it manages for you) and invokes your code. This is far more efficient and scalable than running a poller yourself.
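The contrast between being notified and polling can be shown with a toy in-memory model. Nothing here corresponds to a real cloud API; ToyBucket and Poller are hypothetical classes that exist only to illustrate the two styles.

```python
from typing import Callable, Dict, List, Set


class ToyBucket:
    """In-memory stand-in for a storage bucket that can notify subscribers."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data
        # Push style: every subscriber hears about the write immediately.
        for callback in self._subscribers:
            callback(key)


class Poller:
    """Poll style: the consumer must list the bucket and diff against what it has seen."""

    def __init__(self, bucket: ToyBucket) -> None:
        self.bucket = bucket
        self._seen: Set[str] = set()

    def poll(self) -> List[str]:
        new_keys = sorted(set(self.bucket.objects) - self._seen)
        self._seen.update(new_keys)
        return new_keys


bucket = ToyBucket()
pushed: List[str] = []
bucket.subscribe(pushed.append)  # notified as soon as put() runs
poller = Poller(bucket)          # learns nothing until poll() is called

bucket.put("a.txt", b"1")
```

After the put, the push subscriber already knows about a.txt, while the poller only discovers it on its next poll() call, and it has to do listing work on every call whether or not anything changed.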
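Internally, each cloud provider has its own event routing mechanism, and the mechanisms differ. S3 pushes event notifications to Lambda, which invokes the function asynchronously. Cloud Storage publishes notifications through Pub/Sub, which delivers them to Cloud Functions. The default Azure blob trigger instead polls: the Functions runtime scans the container for new or changed blobs. SQS is also polled, with Lambda running pollers that read batches of messages from the queue and invoke the function with them. In every case the serverless platform reuses "warm" function instances where it can and spins up new ones as needed.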
The key levers you control are the event source (e.g., which bucket, which topic), the event type (e.g., create, delete, publish), and the function that gets invoked. You also configure the permissions necessary for the serverless function to read from or write to other services.
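As an illustration of those levers on AWS, the sketch below builds an S3 bucket notification configuration as a plain dictionary. The shape (LambdaFunctionConfigurations, Events, Filter) follows the S3 notification configuration format; the helper name build_s3_notification, the function ARN, its region, and the prefix are assumptions made up for the example.

```python
from typing import Dict, List, Optional


def build_s3_notification(
    function_arn: str, events: List[str], prefix: Optional[str] = None
) -> Dict:
    """Return an S3 notification configuration that targets one Lambda function."""
    config: Dict = {
        "LambdaFunctionArn": function_arn,  # which function gets invoked
        "Events": list(events),             # which event types trigger it
    }
    if prefix is not None:
        # Optional narrowing of the event source to keys under a prefix.
        config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"LambdaFunctionConfigurations": [config]}


# Hypothetical ARN; the account ID mirrors the role ARN used earlier.
notification = build_s3_notification(
    "arn:aws:lambda:us-east-1:123456789012:function:my-s3-processor",
    ["s3:ObjectCreated:*"],
    prefix="incoming/",
)
```

With boto3, a dictionary like this could be passed as the NotificationConfiguration argument of put_bucket_notification_configuration for my-data-bucket. The permissions side is separate: S3 must be allowed to invoke the function, and the function's execution role must be allowed to read the objects it processes.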
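A common point of confusion is the difference between event sources that push events to the serverless platform (S3 event notifications, Pub/Sub) and those that the platform polls on your behalf (SQS through a Lambda event source mapping, and Azure's default blob trigger). The end result for your function code is similar, since it receives event data either way, but the underlying mechanism and the latency can differ. S3 event notifications are typically delivered within seconds, though they can occasionally take a minute or longer. The polling-based Azure blob trigger can lag more noticeably: on the Consumption plan, Microsoft documents delays of up to ten minutes after a function app has gone idle, which is why an Event Grid-based blob trigger is usually suggested when latency matters. Pub/Sub push delivery and Lambda's long polling of SQS normally reach the function with little delay.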
The next hurdle is often managing the state and idempotency of your event-driven functions when multiple events might trigger them or when retries occur.
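As a preview of that problem, here is a minimal idempotency sketch for S3-style records. It assumes each record carries a sequencer field under s3.object (present on S3 object-created events) and uses an in-memory set as a stand-in for a durable store; a real function would need something persistent with an atomic "insert if absent" operation, since function instances do not share memory. All names are hypothetical.

```python
from typing import Callable, Dict, Set

# Stand-in for a durable deduplication store (assumption: in-memory only).
processed_ids: Set[str] = set()


def idempotency_key(record: Dict) -> str:
    """Derive a stable key from bucket, object key, and sequencer."""
    s3 = record["s3"]
    return "/".join(
        [s3["bucket"]["name"], s3["object"]["key"], s3["object"]["sequencer"]]
    )


def process_once(record: Dict, work: Callable[[Dict], None]) -> bool:
    """Run work(record) unless this exact event was already handled."""
    key = idempotency_key(record)
    if key in processed_ids:
        return False  # duplicate delivery or retry: skip
    work(record)
    # Mark as done only after work succeeds, so a failure can be retried.
    processed_ids.add(key)
    return True


sample_record = {
    "s3": {
        "bucket": {"name": "my-data-bucket"},
        "object": {"key": "a.txt", "sequencer": "0055AED6DCD90281E5"},
    }
}

calls = []
first = process_once(sample_record, calls.append)
second = process_once(sample_record, calls.append)
```

The second call is skipped because the key was already recorded, so the work runs once even though the event was delivered twice.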