Replace Cluster Autoscaler with Karpenter for Faster Node Scaling (2026)

Karpenter can provision nodes faster than Cluster Autoscaler because it watches for unschedulable pods and launches instances directly through the cloud provider’s API, rather than resizing pre-defined node groups as Cluster Autoscaler does.

Let’s see Karpenter in action. Imagine you have a Kubernetes cluster and you deploy a new application that requires more resources than are currently available.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: high-cpu-app
spec:
  replicas: 5
  selector:
    matchLabels:
      app: high-cpu
  template:
    metadata:
      labels:
        app: high-cpu
    spec:
      containers:
      - name: cpu-intensive
        image: ubuntu:latest
        command: ["/bin/bash", "-c", "while true; do dd if=/dev/zero of=/dev/null bs=1G count=1; done"]
        resources:
          requests:
            cpu: "2"
          limits:
            cpu: "4"

When this deployment is applied, Kubernetes’ scheduler will try to find nodes that can accommodate these 5 pods, each requesting 2 CPUs. If no existing nodes have enough capacity, these pods will enter a Pending state.

Normally, Cluster Autoscaler would detect these Pending pods, simulate whether adding a node to one of its configured node groups would let them schedule, and then increase that group’s desired size (on AWS, an Auto Scaling group) so the cloud provider launches an instance. This process can take several minutes, as it involves multiple steps: detecting unschedulable pods, deciding to scale up, requesting a new instance, waiting for the instance to boot and register with the cluster, and finally scheduling the pods.

Karpenter, however, operates differently. It watches the Kubernetes API for pods that the scheduler has marked unschedulable. When it sees our high-cpu-app pods stuck in Pending because of insufficient CPU, it analyzes each pod’s scheduling requirements (the 2-CPU request; the 4-CPU limit plays no part in placement) and the instance types available from your cloud provider. Because Karpenter is not tied to pre-defined node groups, it can size new capacity directly to the pending pods.

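Instead of waiting for a separate controller, Karpenter makes a direct decision. It might determine that m5.xlarge EC2 instances (4 vCPUs and 16 GiB of RAM each) are a suitable match. The five pods request 10 CPUs in total, and a node’s allocatable CPU is slightly below its vCPU count, so one m5.xlarge cannot hold them all. Karpenter therefore uses the cloud provider’s API to provision as many instances as the pending pods require.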
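As a rough sizing check for this example, the arithmetic can be sketched in a few lines of Python. The allocatable figure of 3920 millicores is an assumption for a 4-vCPU node after kubelet and system reservations; the real value depends on the AMI and kubelet settings, and `nodes_needed` is a hypothetical helper, not part of Karpenter.

```python
import math

# Assumption: about 3.92 of an m5.xlarge's 4 vCPUs are allocatable to pods.
# Check `kubectl describe node` for the real value on your nodes.
ALLOCATABLE_MILLICPU = 3920

POD_REQUEST_MILLICPU = 2000  # cpu: "2" from the Deployment
REPLICAS = 5


def nodes_needed(replicas: int, request_m: int, allocatable_m: int) -> int:
    """Return how many identical nodes are needed to place all replicas."""
    pods_per_node = allocatable_m // request_m
    if pods_per_node == 0:
        raise ValueError("a single pod does not fit on this node size")
    return math.ceil(replicas / pods_per_node)


# Only one 2-CPU pod fits per node under the assumption above.
print(nodes_needed(REPLICAS, POD_REQUEST_MILLICPU, ALLOCATABLE_MILLICPU))  # 5
```

Under this assumption only one 2-CPU pod fits on each m5.xlarge, so five instances would be needed. A larger instance type would change the result, which is why the set of allowed instance types matters.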

Here’s a simplified look at Karpenter’s decision-making process for our high-cpu-app pods:

Pod Watch: Karpenter’s controller is constantly watching for Pending pods.
Unschedulable Pod Detection: It identifies the 5 high-cpu-app pods that cannot be scheduled due to insufficient CPU.
Requirement Analysis: It reads requests.cpu (2) for each pod. The limits.cpu value (4) caps runtime usage but does not affect where a pod can be scheduled.

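Instance Type Matching: Karpenter consults its configured NodePool (called a Provisioner in the older karpenter.sh/v1alpha5 API), which defines constraints such as allowed instance types, zones, architectures, and capacity types, and matches the pod requirements against the instance types that satisfy them. AWS-specific settings such as subnets and security groups live in a companion EC2NodeClass. In the karpenter.sh/v1 API, the pair might look like this:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: "kubernetes.io/arch"
          operator: In
          values: ["amd64"]
        - key: "kubernetes.io/os"
          operator: In
          values: ["linux"]
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["on-demand", "spot"]
        - key: "topology.kubernetes.io/zone"
          operator: In
          values: ["us-east-1a", "us-east-1b"]
        # Instance types are restricted through a requirement
        - key: "node.kubernetes.io/instance-type"
          operator: In
          values: ["m5.xlarge", "c5.xlarge", "r5.xlarge"]
  limits:
    cpu: 100
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  # Assumed node IAM role name; replace with the role created for your cluster
  role: "KarpenterNodeRole-my-eks-cluster"
  amiSelectorTerms:
    - alias: "al2023@latest"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-eks-cluster"
  # Assumes the security groups carry the same discovery tag as the subnets
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-eks-cluster"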

Node Launch: Based on these constraints and the pods’ needs, Karpenter decides to launch m5.xlarge instances (4 vCPUs each), enough to hold all five pods, and calls the AWS API to provision them.
Instance Registration: The new instances boot, join the EKS cluster as Nodes, and become Ready.
Pod Scheduling: The Kubernetes scheduler then places the high-cpu-app pods onto the new nodes.
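The first three steps above (watch, detect, analyze) can be sketched in Python over plain dictionaries shaped like Kubernetes pod objects. This is only an illustration, not Karpenter’s implementation: `make_pod`, `is_unschedulable`, and `requested_cpu_millis` are hypothetical helpers, and the pod names are made up.

```python
def parse_cpu_millis(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "2" or "500m" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def is_unschedulable(pod: dict) -> bool:
    """True if the pod is Pending and the scheduler reported Unschedulable."""
    status = pod.get("status", {})
    if status.get("phase") != "Pending":
        return False
    return any(
        c.get("type") == "PodScheduled"
        and c.get("status") == "False"
        and c.get("reason") == "Unschedulable"
        for c in status.get("conditions", [])
    )


def requested_cpu_millis(pod: dict) -> int:
    """Sum container CPU requests; limits are ignored for placement."""
    total = 0
    for container in pod["spec"]["containers"]:
        requests = container.get("resources", {}).get("requests", {})
        total += parse_cpu_millis(requests.get("cpu", "0"))
    return total


def make_pod(name: str, phase: str, unschedulable: bool) -> dict:
    """Build a minimal pod dictionary for the example (hypothetical helper)."""
    conditions = []
    if unschedulable:
        conditions.append(
            {"type": "PodScheduled", "status": "False", "reason": "Unschedulable"}
        )
    resources = {"requests": {"cpu": "2"}, "limits": {"cpu": "4"}}
    return {
        "metadata": {"name": name},
        "spec": {"containers": [{"name": "cpu-intensive", "resources": resources}]},
        "status": {"phase": phase, "conditions": conditions},
    }


pods = [make_pod(f"high-cpu-app-{i}", "Pending", True) for i in range(5)]
pods.append(make_pod("already-running", "Running", False))

pending = [p for p in pods if is_unschedulable(p)]
total_millis = sum(requested_cpu_millis(p) for p in pending)
print(len(pending), total_millis)  # 5 10000
```

The sketch counts only requests, so the five pods add up to 10,000 millicores even though their limits would total twice that.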

This entire process, from pod Pending to pod Running, can take roughly 60-90 seconds with Karpenter, compared with the several minutes Cluster Autoscaler often needs. Actual times depend on the instance type, the AMI, and container image pulls. The key difference is Karpenter’s direct, event-driven approach to provisioning and its ability to select and launch an appropriate instance type without going through Cluster Autoscaler’s node-group scaling.

The mental model for Karpenter is that it’s a "pod-driven" autoscaler. It doesn’t think in terms of "node groups" or "Auto Scaling groups"; it thinks in terms of "what nodes do these pending pods need right now?" It batches the unschedulable pods, evaluates them against the available instance types, and launches the node or nodes that fit them. If more pods become unschedulable, it repeats the process for the new batch.
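To illustrate the idea of fitting a batch of pending pods onto new nodes, here is a first-fit-decreasing packing sketch in Python. It is a textbook heuristic used for illustration and is not Karpenter’s actual scheduling simulation; the node capacity of 7910 millicores is an assumed allocatable value for an 8-vCPU node, and the batch of requests is made up.

```python
def pack_first_fit_decreasing(
    requests_m: list[int], node_capacity_m: int
) -> list[list[int]]:
    """Pack CPU requests (millicores) onto identical nodes, largest first."""
    nodes: list[list[int]] = []  # requests placed on each node
    free: list[int] = []         # remaining capacity of each node
    for req in sorted(requests_m, reverse=True):
        if req > node_capacity_m:
            raise ValueError(f"request {req}m exceeds node capacity")
        for i, remaining in enumerate(free):
            if req <= remaining:
                nodes[i].append(req)
                free[i] -= req
                break
        else:
            # No existing node has room, so open a new one.
            nodes.append([req])
            free.append(node_capacity_m - req)
    return nodes


# Five 2-CPU pods plus a few smaller ones arriving in the same batch.
batch = [2000] * 5 + [500, 500, 250]
placement = pack_first_fit_decreasing(batch, node_capacity_m=7910)
print(len(placement))  # 2
```

With these assumed numbers the whole batch fits on two nodes, whereas handling each pod separately would suggest many more.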

A common misconception is that Karpenter always picks the smallest possible node. In practice it looks for instance types that can fit the pending pods and generally favors the lower-cost options among them. If a pod requests 10 CPUs, and c5.2xlarge (8 vCPUs) and c5.4xlarge (16 vCPUs) are available, the c5.2xlarge is ruled out because the pod cannot fit on it, so Karpenter chooses the c5.4xlarge even though part of the node sits idle until other pods land there. Schedulability comes first and cost second. The candidate set is bounded by the instance-type requirements you configure, and Karpenter labels each node it launches with the name of the resource that created it (karpenter.sh/provisioner-name in the v1alpha5 API, karpenter.sh/nodepool in later versions).
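The selection rule in this example can be sketched as "cheapest instance type that fits". The vCPU counts below are the published sizes for these C5 types, but the prices are hypothetical relative units rather than real AWS pricing, and the sketch ignores allocatable overhead, memory, and spot availability, all of which a real decision would consider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    hourly_price: float  # hypothetical relative price, not real AWS pricing


CANDIDATES = [
    InstanceType("c5.2xlarge", 8, 1.0),
    InstanceType("c5.4xlarge", 16, 2.0),
    InstanceType("c5.9xlarge", 36, 4.5),
]


def cheapest_fit(cpu_request: float, candidates: list[InstanceType]) -> InstanceType:
    """Return the lowest-priced candidate with enough vCPUs for the request."""
    fitting = [t for t in candidates if t.vcpus >= cpu_request]
    if not fitting:
        raise ValueError("no candidate instance type can fit the request")
    return min(fitting, key=lambda t: t.hourly_price)


print(cheapest_fit(10, CANDIDATES).name)  # c5.4xlarge
```

The 8-vCPU type is excluded before price is even considered, and the 36-vCPU type loses on price, which matches the reasoning above.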

The next problem you’ll likely encounter is managing the lifecycle of Karpenter itself, specifically its own deployment and permissions within the cluster.