CloudFormation custom resources let you manage anything with CloudFormation, not just AWS resources.
Let’s say you need to provision a third-party service, run some complex logic before or after a resource is created, or even interact with an older API that CloudFormation doesn’t natively support. That’s where custom resources shine. They act as a bridge, allowing you to define custom actions within your CloudFormation stack.
Here’s a simple example. We want to create a custom resource that registers a domain name with a hypothetical DomainRegistrar service.
    Resources:
      MyDomainRegistration:
        Type: Custom::DomainRegistration
        Properties:
          ServiceToken: !GetAtt DomainRegistrationLambda.Arn
          DomainName: example.com
          RegistrantEmail: admin@example.com

      DomainRegistrationLambda:
        Type: AWS::Lambda::Function
        Properties:
          Handler: index.handler
          Role: !GetAtt LambdaExecutionRole.Arn
          Runtime: python3.12
          Code:
            ZipFile: |
              import logging
              import cfnresponse  # This is crucial!

              logger = logging.getLogger()
              logger.setLevel(logging.INFO)

              def handler(event, context):
                  response_data = {}
                  # Only present on Update and Delete requests; None on Create
                  physical_resource_id = event.get('PhysicalResourceId')
                  try:
                      request_type = event['RequestType']
                      props = event['ResourceProperties']
                      domain_name = props['DomainName']
                      registrant_email = props['RegistrantEmail']
                      logger.info(f"Received request: {request_type} for {domain_name}")

                      if request_type == 'Create':
                          # Simulate domain registration
                          logger.info(f"Registering domain: {domain_name} for {registrant_email}")
                          # In a real scenario, you'd call the third-party API here
                          # The PhysicalResourceId should be unique and stable for the resource
                          physical_resource_id = f"domain-{domain_name}"
                          response_data['RegistrationId'] = 'reg-12345abc'
                          cfnresponse.send(event, context, cfnresponse.SUCCESS, response_data, physical_resource_id)

                      elif request_type == 'Update':
                          # Simulate updating domain registration (e.g., changing email)
                          logger.info(f"Updating domain: {domain_name} with new email {registrant_email}")
                          # Update logic here
                          response_data['RegistrationId'] = 'reg-12345abc'  # Should be the existing ID
                          cfnresponse.send(event, context, cfnresponse.SUCCESS, response_data, physical_resource_id)

                      elif request_type == 'Delete':
                          # Simulate deregistering domain
                          logger.info(f"Deregistering domain: {domain_name} (PhysicalResourceId: {physical_resource_id})")
                          # Deregistration logic here
                          cfnresponse.send(event, context, cfnresponse.SUCCESS, {}, physical_resource_id)

                  except Exception as e:
                      # Always answer CloudFormation, even on failure, so the stack does not hang
                      logger.error(f"Failed to process request: {e}")
                      cfnresponse.send(event, context, cfnresponse.FAILED, {'Error': str(e)}, physical_resource_id)

      LambdaExecutionRole:
        Type: AWS::IAM::Role
        Properties:
          AssumeRolePolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Principal:
                  Service: lambda.amazonaws.com
                Action: sts:AssumeRole
          Policies:
            - PolicyName: LambdaLoggingPolicy
              PolicyDocument:
                Version: '2012-10-17'
                Statement:
                  - Effect: Allow
                    Action:
                      - logs:CreateLogGroup
                      - logs:CreateLogStream
                      - logs:PutLogEvents
                    Resource: '*'
The Custom::DomainRegistration resource is the star here. Its Type starts with Custom::, signaling to CloudFormation that this is a custom resource. The ServiceToken property is critical; it points to the ARN of the Lambda function that will handle the actual logic. CloudFormation invokes this Lambda function for Create, Update, and Delete operations.
The Lambda function receives an event object containing all the details: the RequestType (Create, Update, or Delete), the ResourceProperties passed from your CloudFormation template, and, for Update and Delete requests, the PhysicalResourceId of the existing resource. The cfnresponse module is the key to communicating back to CloudFormation. You must call cfnresponse.send() with the correct status (SUCCESS or FAILED), any responseData (whose values become available to the rest of the template through !GetAtt), and a PhysicalResourceId. The PhysicalResourceId is a unique identifier for the instance of your custom resource, and it’s crucial for updates and deletes. If you don’t pass one on Create, cfnresponse falls back to the name of the Lambda function’s CloudWatch log stream, which is neither stable nor meaningful, so it’s best practice to generate one yourself.
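To make the event shape concrete, here is a rough sketch of an Update event written as a Python dict. Every value (ARNs, request ID, URL, email addresses) is a made-up placeholder, and `changed_properties` is a hypothetical helper for illustration, not part of any AWS library. It relies on the fact that Update events also carry the previous property values under `OldResourceProperties`.

```python
# Illustrative Update event; every value below is a placeholder.
sample_update_event = {
    "RequestType": "Update",
    "ResponseURL": "https://example.invalid/presigned-response-url",
    "StackId": "arn:aws:cloudformation:us-east-1:123456789012:stack/demo/placeholder",
    "RequestId": "placeholder-request-id",
    "ResourceType": "Custom::DomainRegistration",
    "LogicalResourceId": "MyDomainRegistration",
    "PhysicalResourceId": "domain-example.com",  # present on Update and Delete only
    "ResourceProperties": {
        "ServiceToken": "arn:aws:lambda:us-east-1:123456789012:function:placeholder",
        "DomainName": "example.com",
        "RegistrantEmail": "ops@example.com",
    },
    "OldResourceProperties": {  # present on Update only
        "ServiceToken": "arn:aws:lambda:us-east-1:123456789012:function:placeholder",
        "DomainName": "example.com",
        "RegistrantEmail": "admin@example.com",
    },
}


def changed_properties(event):
    """Return the sorted names of properties that differ between old and new."""
    new = event.get("ResourceProperties", {})
    old = event.get("OldResourceProperties", {})
    return sorted(key for key in set(new) | set(old) if new.get(key) != old.get(key))


print(changed_properties(sample_update_event))  # ['RegistrantEmail']
```

Comparing old and new properties like this is one way for an Update handler to decide whether it needs to call the third-party API at all.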
Perhaps the most surprising thing about Lambda-backed custom resources is that CloudFormation has no idea what your custom resource actually does. It only cares about the response your Lambda function sends. If the function crashes or times out without ever sending a response, CloudFormation keeps waiting until the custom resource timeout expires (one hour by default) and only then marks the resource as CREATE_FAILED, UPDATE_FAILED, or DELETE_FAILED.
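One common way to avoid a silent hang is a watchdog timer that reports a failure shortly before the Lambda invocation itself would time out. The sketch below is a minimal version of that idea, not an official pattern: `start_timeout_guard` and its `safety_margin_ms` parameter are hypothetical names, and the only Lambda API it assumes is the context object's `get_remaining_time_in_millis()` method.

```python
import threading


def start_timeout_guard(context, on_timeout, safety_margin_ms=2000):
    """Schedule on_timeout() to run shortly before the invocation times out.

    Returns the timer so the caller can cancel it once the real work is done.
    """
    remaining_ms = context.get_remaining_time_in_millis()
    delay_s = max(remaining_ms - safety_margin_ms, 0) / 1000.0
    timer = threading.Timer(delay_s, on_timeout)
    timer.daemon = True  # never keep the process alive just for the guard
    timer.start()
    return timer


# Sketch of use inside a handler (cfnresponse is only importable inside Lambda):
#
#   guard = start_timeout_guard(
#       context,
#       lambda: cfnresponse.send(event, context, cfnresponse.FAILED, {}, physical_id),
#   )
#   try:
#       ...  # do the real work, then send SUCCESS
#   finally:
#       guard.cancel()
```

Cancelling the timer in a `finally` block matters: without it, the guard could send a second, contradictory response after the handler has already reported success.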
When you define a Custom::MyResource in CloudFormation, here’s the lifecycle:
- Create: CloudFormation sends a Create event to your Lambda function. Your function performs the creation logic and calls cfnresponse.send(..., SUCCESS, ..., physicalResourceId). CloudFormation then marks the resource as CREATE_COMPLETE. If the creation logic fails, your function calls cfnresponse.send(..., FAILED, ...) and the stack creation fails.
- Update: If properties change, CloudFormation sends an Update event. Your function handles the update logic and calls cfnresponse.send(..., SUCCESS, ..., physicalResourceId). CloudFormation marks it as UPDATE_COMPLETE. If the PhysicalResourceId you return differs from the one in the event, CloudFormation treats the update as a replacement: it keeps the new ID and, during the cleanup phase, sends a Delete event for the old one.
- Delete: When the resource is removed from the template or the stack is deleted, CloudFormation sends a Delete event. Your function performs cleanup and calls cfnresponse.send(..., SUCCESS, ..., physicalResourceId). CloudFormation marks it as DELETE_COMPLETE.
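The replacement rule in the Update step can be sketched as a small pure function. This assumes the same physical ID scheme as the example handler (`domain-<DomainName>`); the helper names `physical_id_for` and `plan_update` are hypothetical.

```python
def physical_id_for(props):
    """Derive the physical ID from the property that identifies the resource."""
    return f"domain-{props['DomainName']}"


def plan_update(event):
    """Return (physical_id_to_report, is_replacement) for an Update event.

    Reporting a different ID than the one in the event tells CloudFormation
    that a new resource replaced the old one, so it will later send a Delete
    event carrying the old ID.
    """
    current_id = event["PhysicalResourceId"]
    new_id = physical_id_for(event["ResourceProperties"])
    return new_id, new_id != current_id


email_only_change = {
    "PhysicalResourceId": "domain-example.com",
    "ResourceProperties": {"DomainName": "example.com", "RegistrantEmail": "ops@example.com"},
}
domain_change = {
    "PhysicalResourceId": "domain-example.com",
    "ResourceProperties": {"DomainName": "example.org", "RegistrantEmail": "ops@example.com"},
}

print(plan_update(email_only_change))  # ('domain-example.com', False)
print(plan_update(domain_change))      # ('domain-example.org', True)
```

With this approach, changing the email updates the registration in place, while changing the domain registers a new one and leaves deregistration of the old domain to the Delete event that follows.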
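The cfnresponse module is the unsung hero. It wraps the HTTP request your Lambda function has to make to signal completion. Without it, you’d be manually building a JSON body and sending it with an HTTP PUT to the pre-signed URL that CloudFormation provides as ResponseURL in the event. Note that Lambda only makes cfnresponse available automatically when the function code is defined inline through ZipFile, as in the example above; if you deploy the code from an S3 bucket, you need to bundle your own copy.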
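For a sense of what that signalling involves, here is a rough standard-library sketch of sending the response by hand: a JSON body delivered with an HTTP PUT to the event's ResponseURL. The function names and the `fallback_id` parameter are hypothetical, and the empty Content-Type header mirrors what the cfnresponse module is understood to do; treat that detail as an assumption to verify rather than a guarantee.

```python
import json
import urllib.request


def build_response_body(event, status, data=None, physical_resource_id=None,
                        reason=None, fallback_id="unknown"):
    """Assemble the JSON document CloudFormation expects back."""
    return {
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason or "See CloudWatch Logs for details",
        "PhysicalResourceId": (
            physical_resource_id or event.get("PhysicalResourceId") or fallback_id
        ),
        # These three values are echoed back unchanged from the request.
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }


def send_response(event, body, timeout=10):
    """PUT the body to the URL from the event and return the HTTP status."""
    request = urllib.request.Request(
        event["ResponseURL"],
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        # Assumption: an empty Content-Type stops urllib from adding its
        # default form-encoded header to the request.
        headers={"Content-Type": ""},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Splitting body construction from the network call keeps the first part easy to unit test without touching a real ResponseURL.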
The one thing most people don’t realize is how important the PhysicalResourceId is for idempotency and state management. If your Lambda function is called twice with the same Create event due to a retry by CloudFormation (though rare, it can happen), and your function doesn’t use a stable PhysicalResourceId and check if it already exists, you might end up creating duplicate resources or corrupting state. Always ensure your PhysicalResourceId is unique and immutable for a given logical resource.
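A minimal sketch of an idempotent Create follows: check whether the resource already exists before creating it. `FakeRegistrar` is an in-memory stand-in for the hypothetical DomainRegistrar API, and `ensure_registered` is an illustrative helper; a real client would expose different methods.

```python
class FakeRegistrar:
    """In-memory stand-in for the hypothetical DomainRegistrar API."""

    def __init__(self):
        self.registrations = {}
        self.create_calls = 0

    def get(self, domain_name):
        return self.registrations.get(domain_name)

    def create(self, domain_name, email):
        self.create_calls += 1
        registration = {"id": f"reg-{len(self.registrations) + 1}", "email": email}
        self.registrations[domain_name] = registration
        return registration


def ensure_registered(registrar, domain_name, email):
    """Create the registration only if it does not already exist.

    Returns (physical_resource_id, registration_id). Calling it twice with
    the same domain yields the same result and a single create call.
    """
    physical_id = f"domain-{domain_name}"  # stable: derived from the domain only
    existing = registrar.get(domain_name)
    if existing is not None:
        return physical_id, existing["id"]
    created = registrar.create(domain_name, email)
    return physical_id, created["id"]


registrar = FakeRegistrar()
first = ensure_registered(registrar, "example.com", "admin@example.com")
second = ensure_registered(registrar, "example.com", "admin@example.com")  # simulated retry
print(first == second, registrar.create_calls)  # True 1
```

Because the physical ID is derived from the domain name rather than generated randomly, a retried Create reports the same identifier as the first attempt.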
The next concept you’ll likely encounter is handling complex dependencies and asynchronous operations within your custom resources, especially when interacting with services that have long provisioning times.