Cilium’s Cluster Pool IPAM feature is designed to solve the problem of IP address exhaustion in large Kubernetes clusters by allowing you to pre-allocate and manage IP address ranges for your pods across multiple nodes.
Let’s see it in action. Imagine you have a cluster with multiple nodes, and you’re running out of IPs for your pods. Instead of requesting every pod IP individually from a central pool, Cilium manages one large pool and hands each node its own block of addresses from it.
Here’s a simplified view of how it works. When a pod needs an IP, Cilium, configured with Cluster Pool IPAM, draws from a pre-defined CIDR that spans across your cluster. This CIDR is broken down into smaller chunks, and these chunks are assigned to individual nodes.
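Consider a scenario where you have a ClusterPoolConfig resource defined. The resource kind and field names in this example are illustrative; in upstream Cilium, the equivalent cluster-pool settings are exposed as the Helm values ipam.operator.clusterPoolIPv4PodCIDRList and ipam.operator.clusterPoolIPv4MaskSize, with ipam.mode set to cluster-pool.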
    apiVersion: cilium.io/v2alpha1
    kind: ClusterPoolConfig
    metadata:
      name: cluster-pool-config
    spec:
      # This defines the overall CIDR block for pod IPs across the cluster.
      # In a real-world scenario, this would be a much larger block, e.g., 10.0.0.0/16.
      clusterPoolIPv4PodCIDR: 192.168.0.0/20
      # Each node is allocated a subnet from the pool; this example uses /24 blocks.
      # A /24 means each node can host up to 254 pods (256 - 2 for network/broadcast).
      allocator:
        type: ipam
        clusterPoolIPAM:
          # This determines how many /24 subnets are available for allocation to nodes.
          # For a /20 cluster pool CIDR, you can have 16 /24 subnets.
          # If this is set to 16, and you have 16 nodes, each node gets a /24.
          # With more than 16 nodes, the pool has no /24 left for the extra nodes.
          numBigHoleIPAMs: 16
When you apply this configuration, Cilium starts managing the IP addresses within 192.168.0.0/20. It carves out 16 subnets of /24 size from this pool. These /24 subnets are then distributed to your nodes. For example, the first node might get 192.168.0.0/24, the second 192.168.1.0/24, and so on, up to 192.168.15.0/24.
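The carving described above can be checked with Python's standard ipaddress module. This is a small arithmetic sketch, not Cilium code; the variable names are made up for illustration, and the CIDR and prefix length are the ones used in the example.

```python
import ipaddress

# Cluster-wide pod CIDR and per-node prefix length from the example.
cluster_pool = ipaddress.ip_network("192.168.0.0/20")
node_prefix_len = 24

# subnets(new_prefix=...) yields every block of that size inside the pool, in order.
node_subnets = list(cluster_pool.subnets(new_prefix=node_prefix_len))

print(len(node_subnets))               # 16
print(node_subnets[0])                 # 192.168.0.0/24
print(node_subnets[-1])                # 192.168.15.0/24
print(node_subnets[0].num_addresses)   # 256
```

Changing node_prefix_len to 25 would double the number of blocks to 32 while halving the addresses in each, which is the trade-off to weigh when sizing the pool.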
In this example, the numBigHoleIPAMs field is crucial. It dictates the number of node-sized subnets (in this case, /24s) that are pre-allocated from the clusterPoolIPv4PodCIDR. A /20 holds exactly 16 /24s, so if you have more nodes than that, the extra nodes will not receive their own /24. In Cilium's cluster-pool mode a block belongs to exactly one node and is not shared, so those nodes cannot assign pod IPs until the pool is enlarged or a block is released. This is where the "big hole" concept comes into play: it refers to the initial large subnets that are distributed.
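To make the capacity limit concrete, here is a toy model of handing out blocks to nodes in order. It is only a sketch under stated assumptions: assign_node_cidrs and the node names are hypothetical, and the function simply reports which nodes are left without their own block rather than reproducing Cilium's operator logic.

```python
import ipaddress

def assign_node_cidrs(pool_cidr, prefix_len, node_names):
    """Hand out one block per node, in order; return (assignments, unassigned)."""
    blocks = ipaddress.ip_network(pool_cidr).subnets(new_prefix=prefix_len)
    assignments = {}
    unassigned = []
    for name in node_names:
        block = next(blocks, None)  # None once the pool has no blocks left
        if block is None:
            unassigned.append(name)
        else:
            assignments[name] = block
    return assignments, unassigned

# 18 hypothetical nodes competing for the 16 /24 blocks in a /20.
nodes = [f"node-{i}" for i in range(18)]
assignments, unassigned = assign_node_cidrs("192.168.0.0/20", 24, nodes)

print(assignments["node-0"])    # 192.168.0.0/24
print(assignments["node-15"])   # 192.168.15.0/24
print(unassigned)               # ['node-16', 'node-17']
```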
The actual IP allocation for pods happens from these node-specific subnets. So, if a pod is scheduled on Node A, which received 192.168.0.0/24, its IP will be something like 192.168.0.X.
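Because each block belongs to one node, a pod IP can be traced back to its node with a simple containment check. The sketch below assumes a hypothetical node-to-block mapping (node-a, node-b) and a made-up helper, node_for_pod_ip; it illustrates the relationship and is not a Cilium API.

```python
import ipaddress

# Hypothetical node-to-block mapping matching the example.
node_cidrs = {
    "node-a": ipaddress.ip_network("192.168.0.0/24"),
    "node-b": ipaddress.ip_network("192.168.1.0/24"),
}

def node_for_pod_ip(pod_ip, node_cidrs):
    """Return the name of the node whose block contains pod_ip, or None."""
    addr = ipaddress.ip_address(pod_ip)
    for name, cidr in node_cidrs.items():
        if addr in cidr:
            return name
    return None

print(node_for_pod_ip("192.168.0.37", node_cidrs))    # node-a
print(node_for_pod_ip("192.168.1.200", node_cidrs))   # node-b
print(node_for_pod_ip("192.168.9.5", node_cidrs))     # None
```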
This approach offers predictable IP allocation, because every pod IP on a node comes from a block that is known in advance, and it lets you size the pool for the whole cluster up front. It’s particularly useful when you have a fixed number of nodes and want to ensure each has a dedicated block of IPs for its pods, preventing scenarios where a single node’s IP demand could deplete the cluster’s available IPs.
The most surprising thing about Cluster Pool IPAM is that even though you’re defining a large cluster-wide CIDR, the actual IP assignment to pods is still granular and happens at the node level, drawing from subnets that have been pre-allocated to that specific node. This hybrid approach is key to its scalability.
The next step you’ll likely encounter is managing the lifecycle of these ClusterPoolConfig resources, especially when scaling your cluster up or down, and understanding how IP address reuse is handled.