The AWS Node Termination Handler gracefully drains EKS nodes backed by Spot Instances when AWS signals an impending interruption, giving your applications a chance to shut down cleanly.
Let’s see it in action. Imagine you have a Kubernetes deployment running on EKS Spot Instances. When AWS decides to reclaim a Spot Instance, it issues an interruption notice, a warning delivered roughly two minutes before the instance is reclaimed. Without the Node Termination Handler, your pods would be killed abruptly when the instance goes away, potentially leading to data loss or service disruption.
Here’s how it works:
- Spot Instance Interruption Notice: AWS issues an interruption notice about two minutes before it reclaims the instance. The notice is exposed through the instance metadata service (IMDS) at spot/instance-action and contains an action (terminate, stop, or hibernate) and a time, the approximate UTC timestamp at which the action will occur.
- Node Termination Handler Detection: The Node Termination Handler, running as a DaemonSet on your EKS nodes, polls the instance metadata service for these interruption notices.
- Node Cordon and Kubernetes Event: Upon detecting an interruption notice, the handler cordons the node so that no new pods are scheduled onto it. It can also record a Kubernetes Event, which is informational: it makes the interruption visible in the cluster but does not itself move any workloads.
- Graceful Pod Termination: The handler then drains the node by evicting its pods through the Kubernetes API, which starts the normal graceful shutdown for each pod. This includes:
  - Sending SIGTERM to the containers within the pods.
  - Waiting up to the terminationGracePeriodSeconds defined in your pod specifications, then sending SIGKILL to any container still running.
  - Keeping the node cordoned throughout. A cordoned node is marked unschedulable (SchedulingDisabled), not NotReady.
- Node Removal: The handler does not delete the node itself. Once AWS terminates the instance, the node is removed from the Kubernetes cluster.
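The detection and drain steps above can be approximated by hand. The sketch below is an illustration under stated assumptions, not the handler's actual code: it polls IMDSv2 at the spot/instance-action path, reads the action and time fields of the JSON notice, and then cordons and drains the node with kubectl. The function names, the 5-second poll interval, and the 110-second drain timeout are hypothetical choices for this example.

```shell
#!/usr/bin/env bash
# Sketch only: approximates the detect -> cordon -> drain flow with curl and kubectl.
# Function names, intervals, and timeouts are illustrative assumptions.

IMDS="http://169.254.169.254/latest"

# Request a short-lived IMDSv2 session token.
imds_token() {
  curl -sf -X PUT "$IMDS/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60"
}

# Print the interruption notice JSON. Returns non-zero while no interruption
# is scheduled, because IMDS answers 404 for this path until then.
fetch_instance_action() {
  curl -sf -H "X-aws-ec2-metadata-token: $(imds_token)" \
    "$IMDS/meta-data/spot/instance-action"
}

# Extract one string field from the flat JSON notice without requiring jq.
notice_field() {  # usage: notice_field <json> <field>
  printf '%s\n' "$1" |
    sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# Stop new pods landing on the node, then evict the existing ones.
cordon_and_drain() {  # usage: cordon_and_drain <node-name>
  kubectl cordon "$1"
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data --timeout=110s
}

# Poll until a notice appears, then drain the given node.
watch_and_drain() {  # usage: watch_and_drain <node-name>
  local notice
  while true; do
    if notice=$(fetch_instance_action); then
      echo "notice: action=$(notice_field "$notice" action) time=$(notice_field "$notice" time)"
      cordon_and_drain "$1"
      return
    fi
    sleep 5
  done
}
```

Calling watch_and_drain with a node name only makes sense on a Spot-backed node that has kubectl access to the cluster. The real handler talks to the Kubernetes API directly rather than shelling out to kubectl, so treat this as a mental model of the flow.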
Configuration and Control:
The Node Termination Handler is highly configurable. You deploy it as a Helm chart, and key parameters can be adjusted in your values.yaml file. The names below follow the upstream aws-node-termination-handler chart; confirm them against the values.yaml of the chart version you install:
- awsRegion: Your AWS region, e.g., us-east-1.
- enableSqsTerminationDraining: Set to true to run the handler in Queue Processor mode, where it receives interruption events from an SQS queue instead of polling instance metadata on every node. If true, you’ll also need to configure queueURL.
- queueURL: The URL (not the ARN) of the SQS queue the handler should monitor for interruption messages. Amazon EventBridge rules deliver the EC2 events to this queue directly, so no Lambda function is needed in between.
- enableSpotInterruptionDraining: Set to true to drain the node when a Spot Instance interruption notice is detected via the instance metadata service. This applies to IMDS mode, where the handler runs as a DaemonSet.
- podTerminationGracePeriod: The grace period in seconds given to each pod while the node is drained. If negative (the default), each pod’s own terminationGracePeriodSeconds is used.
In IMDS mode the handler fetches metadata using Instance Metadata Service v2 (IMDSv2) session tokens.
Example Helm Install:
helm upgrade --install aws-node-termination-handler \
  eks/aws-node-termination-handler \
  --namespace kube-system \
  --set awsRegion="us-west-2" \
  --set enableSqsTerminationDraining=true \
  --set queueURL="https://sqs.us-west-2.amazonaws.com/123456789012/EKSNodeTerminationHandlerQueue" \
  --set podTerminationGracePeriod=120
The Mechanical Nuance:
What most people miss is how the terminationGracePeriodSeconds on a pod interacts with the overall node termination process. The Node Termination Handler doesn’t kill pods itself; it asks the Kubernetes API server to evict the pods on the affected node, which starts the normal graceful termination process. The time a pod has to shut down is dictated by its own terminationGracePeriodSeconds (unless the handler is configured to override it), but the handler cannot keep the instance alive: a Spot interruption notice leaves roughly two minutes in total. If a pod’s grace period is longer than the time remaining before the Spot Instance is reclaimed, the pod will be cut off when AWS terminates the instance, before its Kubernetes-managed grace period expires.
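To make the timing concrete, here is a small sketch of the arithmetic. The 120-second window reflects the two-minute Spot interruption notice; the 10-second detection delay and the sample grace periods are assumptions chosen purely for illustration, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: does a pod's grace period fit inside the Spot interruption window?
# All numbers are illustrative assumptions, not measured values.

# Seconds a pod effectively has: the notice window minus the time that
# passes before its eviction actually starts.
shutdown_budget() {  # usage: shutdown_budget <window-seconds> <detection-delay-seconds>
  echo $(( $1 - $2 ))
}

# Succeeds (exit status 0) when the grace period fits within the budget.
grace_period_fits() {  # usage: grace_period_fits <grace-seconds> <budget-seconds>
  [ "$1" -le "$2" ]
}

# 120 s notice, with an assumed 10 s spent detecting it and starting evictions.
budget=$(shutdown_budget 120 10)

for grace in 30 90 300; do
  if grace_period_fits "$grace" "$budget"; then
    echo "terminationGracePeriodSeconds=$grace fits within ${budget}s"
  else
    echo "terminationGracePeriodSeconds=$grace exceeds ${budget}s; the pod may be cut off mid-shutdown"
  fi
done
```

With these numbers, grace periods of 30 and 90 seconds fit inside the 110-second budget and 300 does not, so a pod that needs five minutes to shut down cleanly cannot rely on the grace period alone when running on Spot capacity.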
The next hurdle is often understanding how to manage stateful workloads that require more robust shutdown procedures during interruptions.