Azure Kubernetes Service (AKS) clusters need to communicate with external services over the internet. By default, AKS sends this egress traffic through a Standard Load Balancer with an AKS-managed outbound public IP; node VMs only get their own public IPs if you enable that on the node pool. For security and compliance reasons, however, you often need to control and monitor egress traffic and ensure it originates from static, predictable IP addresses that you own. This is where Azure NAT Gateway comes in.

Let’s see NAT Gateway in action. Imagine you have a stateless web application deployed on AKS. Your application needs to call a third-party API for credit card processing. If the public IPs your cluster’s traffic leaves from change, the third-party API might flag your requests as suspicious, or you might have to keep asking the provider to update their allowlist.

Here’s a basic AKS deployment that might need NAT Gateway:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-processor
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-processor
  template:
    metadata:
      labels:
        app: payment-processor
    spec:
      containers:
      - name: processor
        image: your-docker-repo/payment-processor:v1.2.0
        ports:
        - containerPort: 8080

This deployment runs three pods of a payment processing application. These pods, running on AKS nodes, will eventually need to send requests to an external payment gateway.

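When you deploy this to an AKS cluster without NAT Gateway, egress traffic from these pods leaves through whatever outbound IP the cluster’s default configuration provides: the AKS-managed load balancer IP or, if node public IPs are enabled, the ephemeral public IP of each node VM. A node’s public IP can change when the node is scaled or replaced, and the AKS-managed IP is created and deleted along with the cluster’s managed resources. Either way, this is problematic for external services that rely on IP-based allowlists.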
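
To see how an existing cluster currently sends egress traffic, one option is to query the outbound type recorded in its network profile. The sketch below wraps that query in a small helper; the function name `aks_outbound_type` is hypothetical, and the resource names are the example values used in this article.

```shell
# Print the outbound type configured on an AKS cluster, for example
# "loadBalancer" or "userAssignedNATGateway".
# Arguments: <resource-group> <cluster-name>
aks_outbound_type() {
  local rg="$1" cluster="$2"
  az aks show \
    --resource-group "$rg" \
    --name "$cluster" \
    --query networkProfile.outboundType \
    --output tsv
}

# Usage (example names from this article):
#   aks_outbound_type myAKSResourceGroup myAKSCluster
```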

To fix this, we introduce Azure NAT Gateway. NAT Gateway is a fully managed Azure service that provides outbound connectivity (egress) from virtual networks to the internet. It simplifies outbound-only internet connectivity for virtual networks, whether in AKS or other Azure services.

The core idea is to associate a NAT Gateway with the subnet where your AKS nodes reside. When traffic leaves this subnet and heads to the internet, NAT Gateway intercepts it and translates the source IP address to one of its own static public IP addresses. This ensures all your AKS egress traffic appears to come from a consistent set of IPs.

Here’s how you provision and configure it:

First, you need a public IP address or a public IP prefix for your NAT Gateway. A prefix is recommended when you need multiple IPs, because it gives you one contiguous range to hand to allowlists.

# Create a Resource Group (if you don't have one)
az group create --name myAKSResourceGroup --location eastus

# Create a Public IP Address Prefix (e.g., /28 provides 16 IPs)
az network public-ip prefix create \
  --name myNatGatewayPrefix \
  --resource-group myAKSResourceGroup \
  --location eastus \
  --length 28
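
Choosing the prefix length is simple arithmetic, and it also determines how many concurrent outbound connections the gateway can carry. The following sketch computes both; it assumes the documented Azure figure of 64,512 SNAT ports per NAT Gateway public IP, and the helper names are hypothetical.

```shell
# Number of public IPs in an IPv4 prefix of the given length.
prefix_ip_count() {
  echo $(( 1 << (32 - $1) ))
}

# Total SNAT ports for a prefix of the given length.
# Assumption: 64,512 SNAT ports per public IP attached to the NAT Gateway.
snat_port_capacity() {
  echo $(( $(prefix_ip_count "$1") * 64512 ))
}

# Usage:
#   prefix_ip_count 28      # prints 16
#   snat_port_capacity 28   # prints 1032192
```

A /28 is the largest size a single NAT Gateway can use, since it accepts at most 16 public IPs; a /30 or /31 is enough for many clusters.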

Next, create the NAT Gateway itself, associating it with the public IP prefix.

# Create the NAT Gateway (idle timeout is in minutes; 4 is the default)
az network nat gateway create \
  --name myAKSNatGateway \
  --resource-group myAKSResourceGroup \
  --location eastus \
  --public-ip-prefixes myNatGatewayPrefix \
  --idle-timeout 4

Now, the crucial step: associate this NAT Gateway with the subnet that your AKS nodes use. When you create an AKS cluster, you specify the VNet and subnet. You can either:

  1. Create a new VNet/subnet for AKS and associate NAT Gateway: This is the cleanest approach.
  2. Use an existing VNet/subnet: You’ll need to ensure no other services are using this subnet that might have conflicting egress requirements.

Let’s assume you are creating a new AKS cluster for this purpose. You’ll need to get the resource ID of the subnet you want to associate with the NAT Gateway.

# Get the resource ID of the subnet where AKS will be deployed
SUBNET_ID=$(az network vnet subnet show \
  --resource-group myAKSResourceGroup \
  --vnet-name myAKSNetwork \
  --name myAKSSubnet \
  --query id -o tsv)

# Associate the NAT Gateway with the subnet
az network vnet subnet update \
  --resource-group myAKSResourceGroup \
  --vnet-name myAKSNetwork \
  --name myAKSSubnet \
  --nat-gateway myAKSNatGateway

The az network vnet subnet update command above associates a NAT Gateway with an existing subnet. If you are creating a new AKS cluster and want the subnet configured for NAT Gateway from the start, provision the VNet and subnet first, associate the NAT Gateway with that subnet, and then create the AKS cluster, pointing it to that pre-configured subnet.
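
The commands above assume that myAKSNetwork and myAKSSubnet already exist. If they do not, a minimal sketch for provisioning them might look like this. The helper name `create_aks_network` is hypothetical and both address ranges are placeholder assumptions; pick ranges that fit your own network plan and leave enough room for Azure CNI to assign pod IPs.

```shell
# Create a VNet with a single subnet for the AKS nodes.
# Arguments: <resource-group> <vnet-name> <subnet-name>
# Assumption: 10.10.0.0/16 and 10.10.0.0/22 are placeholder ranges.
create_aks_network() {
  local rg="$1" vnet="$2" subnet="$3"
  az network vnet create \
    --resource-group "$rg" \
    --name "$vnet" \
    --address-prefixes 10.10.0.0/16 \
    --subnet-name "$subnet" \
    --subnet-prefixes 10.10.0.0/22
}

# Usage (run before the subnet show/update commands):
#   create_aks_network myAKSResourceGroup myAKSNetwork myAKSSubnet
```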

When creating a new AKS cluster, you’d provide the subnet ID:

# Example AKS creation command pointing to the pre-configured subnet.
# Azure CNI is used for subnet-level control.
az aks create \
  --resource-group myAKSResourceGroup \
  --name myAKSCluster \
  --vnet-subnet-id "$SUBNET_ID" \
  --node-count 1 \
  --enable-managed-identity \
  --network-plugin azure \
  --outbound-type userAssignedNATGateway # Use the NAT Gateway already attached to the subnet for egress

Why this works: Azure’s networking infrastructure, when it sees traffic originating from a subnet that has a NAT Gateway attached, directs that egress traffic through the NAT Gateway. The NAT Gateway then performs Source Network Address Translation (SNAT), replacing the private IP address of the AKS node (or pod, if using Azure CNI with private IPs for pods) with one of its own public IP addresses. This ensures that all outbound connections from within that subnet appear to come from the static IPs of the NAT Gateway.
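
One way to confirm the translation is happening is to compare the IP an external service sees with the NAT Gateway prefix. The sketch below provides a small CIDR membership check for that comparison. The helper names are hypothetical, and the usage notes assume the public `curlimages/curl` image and the third-party echo service ifconfig.me, neither of which comes from the setup above.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( ($a << 24) | ($b << 16) | ($c << 8) | $d ))
}

# Succeed (exit 0) if IPv4 address $1 lies inside CIDR block $2.
ip_in_cidr() {
  local ip="$1" net="${2%/*}" len="${2#*/}" mask
  mask=$(( (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
}

# Usage (assumes kubectl access to the cluster):
#   prefix_cidr=$(az network public-ip prefix show \
#     --resource-group myAKSResourceGroup --name myNatGatewayPrefix \
#     --query ipPrefix --output tsv)
#   egress_ip=$(kubectl run egress-check --rm -i --quiet --restart=Never \
#     --image=curlimages/curl --command -- curl -s https://ifconfig.me)
#   ip_in_cidr "$egress_ip" "$prefix_cidr" && echo "egress uses the NAT Gateway prefix"
```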

The idle-timeout parameter is important. It defines how long a TCP connection can remain idle before NAT Gateway closes it. The value is specified in minutes, and the default is 4 minutes (240 seconds). If your applications have long-lived idle connections (e.g., long polling, certain database connections), you might need to increase this value. For example, to set it to 10 minutes:

az network nat gateway update \
  --name myAKSNatGateway \
  --resource-group myAKSResourceGroup \
  --idle-timeout 10

This change would be applied to the NAT Gateway resource, and all subnets associated with it would then use the new idle timeout.
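
Application timeouts are usually expressed in seconds, while the `--idle-timeout` flag takes minutes. A small guard like the following can do the conversion and catch out-of-range values before calling the CLI. It is a sketch with a hypothetical helper name, and it assumes the documented configurable range of 4 to 120 minutes.

```shell
# Convert an idle timeout in seconds to whole minutes for --idle-timeout,
# rounding up. Fails if the result is outside the assumed 4-120 minute range.
idle_timeout_minutes() {
  local minutes=$(( ($1 + 59) / 60 ))
  if [ "$minutes" -lt 4 ] || [ "$minutes" -gt 120 ]; then
    echo "idle timeout must be between 4 and 120 minutes" >&2
    return 1
  fi
  echo "$minutes"
}

# Usage:
#   az network nat gateway update \
#     --name myAKSNatGateway \
#     --resource-group myAKSResourceGroup \
#     --idle-timeout "$(idle_timeout_minutes 600)"
```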

The most common issue people run into is forgetting to associate the NAT Gateway with the correct subnet. If your AKS nodes are in a different subnet than the one you attached the NAT Gateway to, egress traffic will not be translated. Always double-check the subnet ID used by your AKS node pool.
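
That double-check can be scripted. The sketch below lists the subnet of every node pool and reports which NAT Gateway, if any, is attached to it. The function names are hypothetical; the queries rely on the `agentPoolProfiles[].vnetSubnetId` and `natGateway.id` fields returned by the Azure CLI.

```shell
# Print the subnet IDs used by the node pools of an AKS cluster.
# Arguments: <resource-group> <cluster-name>
aks_nodepool_subnets() {
  az aks show --resource-group "$1" --name "$2" \
    --query "agentPoolProfiles[].vnetSubnetId" --output tsv
}

# Print the ID of the NAT Gateway attached to a subnet (empty if none).
# Argument: <subnet-resource-id>
subnet_nat_gateway() {
  az network vnet subnet show --ids "$1" \
    --query natGateway.id --output tsv </dev/null
}

# For every distinct node pool subnet, report its NAT Gateway or flag its absence.
check_nodepool_nat() {
  local subnet nat
  aks_nodepool_subnets "$1" "$2" | sort -u | while read -r subnet; do
    nat="$(subnet_nat_gateway "$subnet")"
    echo "${subnet##*/}: ${nat:-NO NAT GATEWAY}"
  done
}

# Usage:
#   check_nodepool_nat myAKSResourceGroup myAKSCluster
```

Any line ending in NO NAT GATEWAY points at a node pool whose egress traffic is not being translated by the gateway.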

The next challenge you’ll likely face is managing inbound traffic. While NAT Gateway handles egress, you’ll still need an Ingress controller or LoadBalancer service to expose your applications to external clients.
