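Taints and tolerations in EKS are standard Kubernetes mechanisms for controlling where your workloads run: a taint on a node repels pods, and a toleration on a pod allows it to be scheduled onto a node despite a matching taint. Tolerations only permit placement; they do not attract pods to a node.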

Let’s see this in action. Imagine you have a set of GPU-enabled nodes in your EKS cluster that you only want specific machine learning pods to use. You wouldn’t want your standard web server pods accidentally landing on those expensive GPU nodes.

First, we’ll taint the GPU nodes. You do this by adding a taint to the node object. This taint is a key-value pair with a "taint effect." The effect dictates what happens to pods that don’t "tolerate" this taint.

Here’s how you’d add a taint to a node named ip-10-0-1-100.ec2.internal using kubectl:

kubectl taint nodes ip-10-0-1-100.ec2.internal gpu=true:NoSchedule

In this command:

  • gpu=true is the key-value pair.
  • NoSchedule is the taint effect. This effect means that no new pod will be scheduled onto this node unless it has a matching toleration; pods already running on the node are left alone. Other effects are PreferNoSchedule (the scheduler tries to avoid the node but will still use it if nothing else fits) and NoExecute (new pods are kept off, and pods already running on the node are evicted if they don’t tolerate the taint).
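
To confirm the taint landed, you can look at the node object itself. This is a minimal sketch of the fragment you would expect to see in the output of kubectl get node ip-10-0-1-100.ec2.internal -o yaml, trimmed to the relevant field:

```yaml
# Fragment of the Node object after the kubectl taint command above.
# All other fields of the node are omitted here.
spec:
  taints:
  - key: gpu
    value: "true"
    effect: NoSchedule
```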

Now, if you deploy a pod without a matching toleration, the scheduler will not place it on this node. The pod lands on an untainted node instead, or stays Pending if no other node fits.

To allow specific pods to run on these tainted nodes, you add a "toleration" to the pod’s specification. This is done within the pod’s .spec.tolerations field (for a Deployment, that is .spec.template.spec.tolerations).

Here’s a sample Deployment YAML that tolerates the gpu=true:NoSchedule taint:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gpu-app
  template:
    metadata:
      labels:
        app: gpu-app
    spec:
      containers:
      - name: gpu-container
        image: nvidia/cuda:11.0-base
        resources:
          limits:
            nvidia.com/gpu: 1 # Requesting a GPU
      tolerations:
      - key: "gpu"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"

In this pod spec:

  • key: "gpu" matches the key of the taint.
  • operator: "Equal" means the value must be exactly matched. Other operators include Exists (which matches if the key exists, regardless of value).
  • value: "true" matches the value of the taint.
  • effect: "NoSchedule" matches the effect of the taint.

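When you apply this Deployment, the scheduler sees that the gpu-workload pods tolerate the gpu=true:NoSchedule taint and is allowed to place them on the tainted GPU nodes. The toleration itself only permits that placement; it does not direct the pods there. In this example the nvidia.com/gpu limit does the steering, because only nodes that advertise that resource can satisfy it. Pods without the toleration are kept off those nodes.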
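
For workloads that have no GPU resource request to steer them, a toleration is usually paired with a node selector so the pods are both allowed onto the tainted nodes and restricted to them. A minimal sketch, assuming you have labeled the GPU nodes yourself with a hypothetical node-type=gpu label:

```yaml
# Pod template fragment; in a Deployment this sits under spec.template.spec.
# The label node-type=gpu is hypothetical and must be added to the GPU nodes.
nodeSelector:
  node-type: gpu          # restricts the pod to nodes carrying this label
tolerations:
- key: "gpu"
  operator: "Equal"
  value: "true"
  effect: "NoSchedule"    # permits scheduling despite the gpu=true taint
```

The node selector narrows where the pod may go, and the toleration removes the taint as an obstacle on exactly those nodes.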

This mechanism is crucial for efficiently managing diverse hardware in your cluster. You can dedicate nodes for specific tasks (like high-memory instances for databases, or nodes with specific hardware accelerators) and ensure only the appropriate workloads land there.
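
One practical caveat on EKS: a taint added with kubectl taint lives only on that one node object, so a replacement node launched by the node group comes up untainted. Declaring the taint on the node group avoids this. The sketch below assumes you manage node groups with eksctl, which this article has not specified; the cluster name, region, node group name, and sizing are all hypothetical, and the field names should be checked against the eksctl schema for your version:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster        # hypothetical cluster name
  region: us-east-1         # hypothetical region
managedNodeGroups:
- name: gpu-nodes           # hypothetical node group name
  instanceType: p3.2xlarge  # example GPU instance type
  desiredCapacity: 2
  taints:                   # applied to every node the group launches
  - key: gpu
    value: "true"
    effect: NoSchedule
```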

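A common pattern is to taint only the specialized nodes and leave the general-purpose nodes untainted. The taint repels general pods, and you add tolerations to the pods that should run on the specialized nodes. This is more robust than relying solely on node selectors or affinity rules, which each pod has to opt into: a taint is enforced by the scheduler against every pod that has not been explicitly given a toleration, so a pod with a missing selector cannot land on a specialized node by accident. When the tolerating pods must run only on the specialized nodes, combine the taint with a node selector or affinity rule.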

Consider a scenario where you want to reserve certain nodes for critical system pods. You can taint those reserved nodes with critical=true:NoExecute and give your critical system pods a toleration for critical=true:NoExecute. Any pod without the toleration is kept off these nodes, and is evicted if it was already running there. This is a powerful way to ensure resource availability for your most important applications.
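
The toleration for that scenario follows the same shape as the GPU example, with the effect changed to match the taint. A minimal sketch of the fragment a critical pod would carry:

```yaml
# Pod spec fragment for a pod allowed to stay on the reserved nodes.
tolerations:
- key: "critical"
  operator: "Equal"
  value: "true"
  effect: "NoExecute"   # must match the taint's effect, or the pod is evicted
```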

The NoExecute effect is particularly potent. If a node is tainted with NoExecute, any pod running on that node that does not tolerate the taint is evicted, and no new non-tolerating pods are scheduled there. This is extremely useful for maintenance or when a node’s hardware is no longer suitable for a running pod. Note that this is not the same as kubectl drain: pods that tolerate the taint stay where they are, and only the incompatible pods are removed.
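
A toleration for a NoExecute taint can also set tolerationSeconds, which lets the pod keep running for a grace period after the taint appears instead of being evicted immediately. A minimal sketch, using a hypothetical maintenance taint key:

```yaml
# Pod spec fragment. The key "maintenance" is a hypothetical taint key.
tolerations:
- key: "maintenance"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 120   # pod is evicted 120 seconds after the taint is added
```

Without tolerationSeconds, a matching NoExecute toleration lets the pod stay on the node indefinitely.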

The Exists operator for tolerations is useful when you just want to allow a pod onto a node with a specific taint, regardless of the taint’s value. For instance, if you have multiple types of specialized nodes, you might taint them all with a common key like workload-type. A pod that can run on any specialized node could then have a toleration like:

tolerations:
- key: "workload-type"
  operator: "Exists"
  effect: "NoSchedule"

This allows the pod to be scheduled on any node carrying a NoSchedule taint with the workload-type key, irrespective of the taint’s value.
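
Exists can be loosened further. Leaving out the effect makes the toleration match every effect for that key, and leaving out the key as well makes it match every taint. A sketch of both variants, shown as two alternatives rather than something to combine in one pod:

```yaml
# Variant 1: tolerate the workload-type taint with any value and any effect.
tolerations:
- key: "workload-type"
  operator: "Exists"
---
# Variant 2: tolerate every taint on every node. Use sparingly, since it
# also bypasses taints meant to keep pods off unhealthy or reserved nodes.
tolerations:
- operator: "Exists"
```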

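Beyond the taints you add yourself, nodes can also carry taints that are applied for you. Kubernetes taints nodes automatically based on their condition, for example node.kubernetes.io/not-ready and node.kubernetes.io/unreachable, and EKS Fargate nodes carry the taint eks.amazonaws.com/compute-type=fargate:NoSchedule. Don’t confuse these with eks.amazonaws.com/capacityType=ON_DEMAND or eks.amazonaws.com/fargate-profile, which are labels rather than taints: they are useful in node selectors but do not repel pods. You often don’t need to modify any of these, but knowing which taints exist helps explain why certain pods don’t land on specific nodes without explicit tolerations.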
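
Kubernetes itself also taints nodes automatically when they become unhealthy, and in a default configuration it adds matching tolerations to pods that don’t define their own. A sketch of what you would typically find on a running pod with kubectl get pod -o yaml, assuming the cluster uses the default toleration settings:

```yaml
# Tolerations typically injected into a pod by default.
# Each lets the pod stay bound for 300 seconds after the node is tainted.
tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 300
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 300
```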

The next step in fine-tuning pod placement involves understanding node affinity and anti-affinity rules, which offer more nuanced control than taints and tolerations alone.
