CloudFormation stack drift means your live infrastructure no longer matches the configuration defined in your CloudFormation templates. It usually happens because resources were changed outside CloudFormation, which does not notice those changes until you run drift detection.
Common Causes and Fixes for Stack Drift
1. Manual Changes via AWS Console or CLI:
- Diagnosis: Run `aws cloudformation detect-stack-drift --stack-name YourStackName` and note the `StackDriftDetectionId` it returns. Use `aws cloudformation describe-stack-drift-detection-status --stack-drift-detection-id <your-detection-id>` to confirm that detection has completed and whether the stack as a whole is `DRIFTED`. Then run `aws cloudformation describe-stack-resource-drifts --stack-name YourStackName` to see which resources drifted and which properties differ.
- Fix: For each drifted resource, you have two primary options:
  - Drift Detection and Reconciliation: If the manual change was intentional and you want CloudFormation to adopt it, update the template so that it matches the live resource, then update the stack. Resubmitting an unchanged template does not clear drift; CloudFormation simply reports that there are no updates to perform. For example, if an EC2 instance’s `InstanceType` was manually changed and the template specifies `t2.micro` but you want to keep `t3.medium`, you must change the template to `t3.medium` first. After the update, run `aws cloudformation detect-stack-drift --stack-name YourStackName` again to confirm the resource is `IN_SYNC`. If you are using StackSets, the equivalent detection command is `aws cloudformation detect-stack-set-drift --stack-set-name YourStackSetName`.
  - Rollback and Reapply: If the manual change was unintentional or incorrect, revert the change in the AWS Console/CLI. Then run `aws cloudformation detect-stack-drift --stack-name YourStackName` again. Once drift is no longer detected, you can proceed.
- Why it works: CloudFormation operates on a declarative model. When you make changes outside of its direct management, the "desired state" (template) and the "actual state" (live resource) diverge. Re-running detection and then either updating the template to match the actual state or reverting the actual state to match the template re-aligns these two.
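The diagnosis step above can be sketched as a small shell helper. This is a minimal sketch, assuming a drift detection run has already completed for the stack; the function name `list_drifted_resources` is hypothetical, while the command, filter values, and field names come from the AWS CLI's `describe-stack-resource-drifts` output.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print one line per drifted resource
# (logical ID, resource type, drift status), tab-separated.
# Assumes drift detection has already completed for the stack.
list_drifted_resources() {
  local stack_name="$1"
  aws cloudformation describe-stack-resource-drifts \
    --stack-name "$stack_name" \
    --stack-resource-drift-status-filters MODIFIED DELETED \
    --query 'StackResourceDrifts[].[LogicalResourceId,ResourceType,StackResourceDriftStatus]' \
    --output text
}
```

Calling `list_drifted_resources YourStackName` should print nothing when every checked resource is in sync, which makes it easy to use in a quick manual check before deciding whether to adopt or revert a change.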
2. Resource Deletion Outside CloudFormation:
- Diagnosis: Run `aws cloudformation detect-stack-drift --stack-name YourStackName`, then look for resources with a `DELETED` drift status in the output of `aws cloudformation describe-stack-resource-drifts --stack-name YourStackName`.
- Fix: Update your CloudFormation template to remove the deleted resource, along with anything else in the template that references it. Then update the stack using `aws cloudformation update-stack --stack-name YourStackName --template-body file://your-template.yaml`.
- Why it works: CloudFormation believes the resource should exist. When it’s gone, it flags drift. Removing the reference to the deleted resource in the template tells CloudFormation that the resource is no longer supposed to be managed by this stack, resolving the drift.
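Before applying the edited template, it can help to preview what the update would do. The sketch below uses a change set for that; it is one possible approach, not the only one. The function name `preview_template_change` and the `drift-fix-` change set name prefix are hypothetical, and the sketch assumes the template needs no `--capabilities` or `--parameters` arguments.

```shell
#!/usr/bin/env bash

# Hypothetical helper: create a change set from an edited template and
# print the planned changes (action, logical ID, resource type).
# Assumes the template needs no --capabilities or --parameters.
preview_template_change() {
  local stack_name="$1" template_file="$2"
  local change_set_name
  change_set_name="drift-fix-$(date +%s)"

  aws cloudformation create-change-set \
    --stack-name "$stack_name" \
    --change-set-name "$change_set_name" \
    --template-body "file://${template_file}" >/dev/null || return 1

  # Block until the change set has been computed (fails if there are no changes).
  aws cloudformation wait change-set-create-complete \
    --stack-name "$stack_name" \
    --change-set-name "$change_set_name" || return 1

  aws cloudformation describe-change-set \
    --stack-name "$stack_name" \
    --change-set-name "$change_set_name" \
    --query 'Changes[].ResourceChange.[Action,LogicalResourceId,ResourceType]' \
    --output text
}
```

For a template that only drops the deleted resource, running `preview_template_change YourStackName your-template.yaml` would be expected to list a single `Remove` action. Anything else in the list is worth investigating before updating the stack.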
3. Resource Modification by Other Services (e.g., Auto Scaling, Lambda, EC2 Instance Connect):
- Diagnosis: Run `aws cloudformation detect-stack-drift --stack-name YourStackName` and identify resources with a `MODIFIED` drift status. Use `aws cloudformation describe-stack-resource-drifts --stack-name YourStackName` for specifics: each entry’s `PropertyDifferences` lists the expected and actual value of every changed property. For example, an EC2 instance’s `UserData` might be changed by a new launch.
- Fix: Examine the drift details to understand what changed. If the change is intentional and desired (e.g., an Auto Scaling group replacing an instance with updated user data), update your CloudFormation template to reflect the new configuration so that the current state becomes the new baseline. If the change is undesirable, revert it manually and then re-run drift detection. If it’s an ongoing process (like `UserData` updates on new launches), you may need to re-architect to avoid drift or accept that certain dynamic attributes will show as drifted.
- Why it works: These services can modify resources independently. CloudFormation, by default, only knows about the state it last set. When another service alters it, drift occurs. Reconciling the template or accepting the new state brings them back in sync.
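To see exactly which properties another service changed on a single resource, the per-resource drift command can be wrapped as follows. This is a minimal sketch: the function name `show_property_differences` is hypothetical, and note that `detect-stack-resource-drift` re-checks that one resource rather than reading cached results.

```shell
#!/usr/bin/env bash

# Hypothetical helper: re-check one resource and print each differing
# property as: path, difference type, expected value, actual value.
show_property_differences() {
  local stack_name="$1" logical_id="$2"
  aws cloudformation detect-stack-resource-drift \
    --stack-name "$stack_name" \
    --logical-resource-id "$logical_id" \
    --query 'StackResourceDrift.PropertyDifferences[].[PropertyPath,DifferenceType,ExpectedValue,ActualValue]' \
    --output text
}
```

The first argument is the stack name and the second is the resource’s logical ID as it appears in the template. Comparing the expected and actual columns usually makes it clear whether the template should be updated or the live change reverted.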
4. Resource Tags Added/Removed Manually or by Other Services:
- Diagnosis: Run `aws cloudformation detect-stack-drift --stack-name YourStackName`. Affected resources show a `MODIFIED` drift status, and the drift details specifically list the tag changes.
- Fix: If the tags were added/removed intentionally, update your CloudFormation template to include or exclude those tags. Then, update the stack. If the tags were added/removed by another automated process (e.g., cost allocation tagging tools), you might need to configure those tools to respect CloudFormation-managed tags or exclude those resources from their tagging.
- Why it works: Tags are attributes of resources, and CloudFormation tracks them. Any discrepancy between the template’s defined tags and the actual tags on the resource will be reported as drift.
5. Resource Properties Modified by CloudFormation Itself (e.g., during updates):
- Diagnosis: This is less about external drift and more about CloudFormation’s state management. Drift is measured against the property values CloudFormation expects from the most recently applied template, so a successful update resets the baseline. If `detect-stack-drift` reports `MODIFIED` for a resource shortly after an update, even though the template has not changed since, something altered the property after CloudFormation set it.
- Fix: A successful update means the new state is now reflected in CloudFormation’s managed properties, and no action is needed if the update was intentional. However, if you then manually change a property that CloudFormation just updated, you’ll get drift. The fix is to either revert the manual change or update your template to match.
- Why it works: CloudFormation updates its internal record of a resource’s properties when it performs an update. If you then manually change a property that CloudFormation just updated, the actual state diverges from CloudFormation’s current recorded state (which is now the desired state for subsequent operations).
6. Drift Detection Configuration Issues:
- Diagnosis: If `detect-stack-drift` is not returning expected results or is slow, check `DetectionStatus`, `DetectionStatusReason`, and `Timestamp` in the output of `describe-stack-drift-detection-status`. Ensure the drift detection is actually running and completing.
- Fix: Manually trigger a new drift detection by running `aws cloudformation detect-stack-drift --stack-name YourStackName`. If you are using CloudFormation StackSets, run `detect-stack-set-drift` instead and ensure that the roles used by the stack set have the necessary permissions.
- Why it works: Drift detection is an asynchronous operation. If it fails to start or complete, no drift status will be reported. Re-initiating it ensures the process runs correctly.
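Because detection is asynchronous, a script has to poll for the result. The following is a minimal sketch of that loop; the function name `run_drift_detection`, the default of 30 attempts, and the 10-second delay are assumptions chosen for illustration, while the status values and field names are those returned by `describe-stack-drift-detection-status`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: start drift detection, poll until it finishes,
# and print the overall stack drift status (e.g. DRIFTED or IN_SYNC).
# Usage: run_drift_detection <stack-name> [max-attempts] [delay-seconds]
run_drift_detection() {
  local stack_name="$1" max_attempts="${2:-30}" delay="${3:-10}"
  local detection_id status attempt

  detection_id="$(aws cloudformation detect-stack-drift \
    --stack-name "$stack_name" \
    --query StackDriftDetectionId --output text)" || return 1

  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    status="$(aws cloudformation describe-stack-drift-detection-status \
      --stack-drift-detection-id "$detection_id" \
      --query DetectionStatus --output text)" || return 1

    case "$status" in
      DETECTION_COMPLETE)
        aws cloudformation describe-stack-drift-detection-status \
          --stack-drift-detection-id "$detection_id" \
          --query StackDriftStatus --output text
        return 0
        ;;
      DETECTION_FAILED)
        # Surface the reason so a failed run is not mistaken for "no drift".
        aws cloudformation describe-stack-drift-detection-status \
          --stack-drift-detection-id "$detection_id" \
          --query DetectionStatusReason --output text >&2
        return 1
        ;;
    esac
    sleep "$delay"
  done

  echo "Timed out waiting for drift detection ${detection_id}" >&2
  return 1
}
```

A bounded loop is used here so that a detection run that never completes produces a clear timeout instead of hanging. Large stacks may need a higher attempt count or a longer delay.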
The next error you’ll likely encounter after fixing drift is a CloudFormation stack update failing because the stack is stuck in a rollback state, such as `UPDATE_ROLLBACK_FAILED`, after a previous failed update attempt.