Monitor AWS Lambda Cold Starts with Elastic APM (2026)

AWS Lambda cold starts aren’t just about latency; each one signals that Lambda had to create a new execution environment for your function, which happens after a period of inactivity, after a deployment, or when concurrency scales out. Frequent cold starts can therefore be a symptom of bursty traffic or inefficient scaling.

Let’s see this in action. Imagine we have a Python Lambda function that queries a PostgreSQL database. We’ll instrument it with the Elastic APM Python agent.

# lambda_function.py
import os
import time

import psycopg2
from elasticapm import Client, capture_span

# Initialize the Elastic APM client at module level, so it runs once per
# execution environment (that is, only during a cold start)
client = Client({
    'SERVICE_NAME': 'my-lambda-service',
    'SERVER_URL': os.environ['ELASTIC_APM_SERVER_URL'],
    'ENVIRONMENT': 'dev',
    'SECRET_TOKEN': os.environ.get('ELASTIC_APM_SECRET_TOKEN')
})

def lambda_handler(event, context):
    start_time = time.time()

    # Spans are only recorded inside an active transaction
    client.begin_transaction("request")
    result = "success"
    try:
        # Record a cold_start marker span on the first invocation only
        if not hasattr(lambda_handler, "initialized"):
            with capture_span("cold_start", "lambda"):
                lambda_handler.initialized = True

        with capture_span("database_query", "db"):
            try:
                # Placeholder connection details; replace with your own
                conn = psycopg2.connect(
                    dbname="mydb",
                    user="myuser",
                    password="mypassword",
                    host="my-rds-instance.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com",
                    port="5432"
                )
                cur = conn.cursor()
                cur.execute("SELECT version();")
                db_version = cur.fetchone()
                print(f"Database version: {db_version}")
                cur.close()
                conn.close()
            except Exception as e:
                # Report the error to Elastic APM and return a 500 response
                client.capture_exception()
                result = "failure"
                return {"statusCode": 500, "body": f"Error querying database: {e}"}

        duration = time.time() - start_time
        print(f"Lambda execution time: {duration:.4f} seconds")

        return {
            "statusCode": 200,
            "body": "Database query successful!"
        }
    finally:
        # Always close the transaction, including on the error path
        client.end_transaction("lambda_handler", result)

When this Lambda function is invoked for the first time in a new execution environment, for example after a period of inactivity, the handler marks the invocation with a cold_start span. The span is created inside the handler, so it flags the invocation as cold rather than timing the work Lambda does before the handler runs: creating the execution environment, loading your code, and running module-level initialization such as the APM client setup. Subsequent invocations within the warm execution environment will not generate this cold_start span, allowing you to differentiate between the two.

The problem this solves is understanding the true cost of serverless first-invocation latency. Developers often focus on the application code’s execution time, but the infrastructure spin-up time can be a significant, and often invisible, portion of the total request duration. Tagging cold invocations with an explicit cold_start span makes them easy to find in Elastic APM and to compare against warm ones.

In this example, the handler uses a simple flag (lambda_handler.initialized) to detect the first invocation. Function attributes are part of module state, which persists for as long as the execution environment stays warm. When lambda_handler is called for the very first time, hasattr returns False, so the not hasattr check passes: the handler records the cold_start span and sets the flag. Note that capture_span is a context manager (or decorator), so it must wrap a block using a with statement, and the span is only recorded while a transaction is active. For subsequent calls in the same environment, the flag is already set and the cold_start span is not created.
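The flag pattern described above can be isolated in a minimal, standard-library-only sketch. The names consume_cold_start_flag and handler are illustrative and not part of the Elastic APM agent; the sketch assumes a module-level boolean instead of a function attribute, which behaves the same way in a warm environment.

```python
import time

# Module-level state is created once per execution environment and
# survives across warm invocations.
_cold_start = True


def consume_cold_start_flag():
    """Return True only for the first call in this execution environment."""
    global _cold_start
    was_cold, _cold_start = _cold_start, False
    return was_cold


def handler(event, context):
    started = time.perf_counter()
    is_cold = consume_cold_start_flag()
    # ... real work would go here ...
    return {
        "statusCode": 200,
        "cold_start": is_cold,
        "handler_seconds": time.perf_counter() - started,
    }
```

Calling handler twice in the same process returns cold_start as True the first time and False afterwards, which mirrors what happens across invocations of a warm Lambda environment.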

The key levers you control are:

SERVICE_NAME: Identifies your Lambda function in Elastic APM.
SERVER_URL: The endpoint of your Elastic APM server.
ENVIRONMENT: Helps segment your data (e.g., 'dev', 'staging', 'prod').
SECRET_TOKEN: For authenticated APM server connections.
Instrumentation: Where you place capture_span calls to delineate specific operations within your Lambda.
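One way to keep these settings out of the source code is to collect them from environment variables before creating the client. The sketch below is a hypothetical helper (build_apm_config is not an agent API); it assumes the ELASTIC_APM_-prefixed variable names and reuses the defaults from the example above.

```python
import os


def build_apm_config(environ=os.environ):
    """Collect agent settings from environment variables (hypothetical helper)."""
    config = {
        "SERVICE_NAME": environ.get("ELASTIC_APM_SERVICE_NAME", "my-lambda-service"),
        # Required: raises KeyError early if the server URL is not configured.
        "SERVER_URL": environ["ELASTIC_APM_SERVER_URL"],
        "ENVIRONMENT": environ.get("ELASTIC_APM_ENVIRONMENT", "dev"),
    }
    # Only include the token when one is set, so unauthenticated
    # development setups still produce a clean config.
    token = environ.get("ELASTIC_APM_SECRET_TOKEN")
    if token:
        config["SECRET_TOKEN"] = token
    return config
```

The resulting dictionary has the same shape as the one passed to Client in the example, so it could be used as Client(build_apm_config()).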

One aspect often overlooked is that the APM agent initialization itself contributes to the cold start duration. Because the client is created at module level, its setup runs during Lambda’s init phase, before the handler is called, so any complex configuration or network lookups at that point lengthen every cold start. That time falls outside anything measured inside the handler, including the cold_start span, so it has to be measured separately. For languages with more explicit initialization phases or for agents that perform background setup, this contribution can be more pronounced. If it turns out to be significant, consider simplifying the agent configuration or deferring some agent setup, though for most common Lambda use cases the default initialization overhead is small.
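To see how much of the init phase the agent accounts for, module-level setup steps can be timed individually. This is a rough sketch using only the standard library; timed, init_timings, and the stand-in client are hypothetical names, and the lambda passed to timed would be replaced by the real client construction.

```python
import time

# Should be the first statement in the module so it covers all later setup.
_init_started = time.perf_counter()


def timed(label, setup, timings):
    """Run one setup step and record how long it took, in seconds."""
    started = time.perf_counter()
    result = setup()
    timings[label] = time.perf_counter() - started
    return result


init_timings = {}

# Stand-in for constructing the APM client; swap in the real call.
fake_client = timed("apm_client", lambda: object(), init_timings)

# Total module-level init time observed from inside the function code.
init_timings["total_init"] = time.perf_counter() - _init_started
```

Printing init_timings on the first invocation gives a per-step breakdown that can be compared with the overall cold start cost.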

The next concept to explore is how to correlate Lambda cold starts with other AWS observability data, such as CloudWatch Logs and X-Ray traces.
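As a starting point for that correlation, the REPORT line that Lambda writes to CloudWatch Logs at the end of each invocation includes an Init Duration field only when the invocation was a cold start. The sketch below parses that line with a regular expression; parse_report_line is a hypothetical helper, and it assumes the usual "Name: value ms" layout of the REPORT line.

```python
import re

# Matches "Name: 12.34 ms" pairs. Longer names come first in the alternation
# so "Billed Duration" is not captured as plain "Duration".
_MS_FIELD = re.compile(r"(Init Duration|Billed Duration|Duration): ([\d.]+) ms")


def parse_report_line(line):
    """Return the millisecond fields of a REPORT line plus a cold_start flag."""
    fields = {name: float(value) for name, value in _MS_FIELD.findall(line)}
    # Lambda reports Init Duration only for cold starts.
    fields["cold_start"] = "Init Duration" in fields
    return fields


# Illustrative line with made-up values, not real log output.
sample = (
    "REPORT RequestId: abc\tDuration: 102.25 ms\tBilled Duration: 103 ms\t"
    "Memory Size: 128 MB\tMax Memory Used: 50 MB\tInit Duration: 150.32 ms"
)
parsed = parse_report_line(sample)
```

Joining these parsed records with APM transactions on the request ID is one possible way to line up the two data sources.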