AKS planned maintenance windows are a surprisingly robust way to control when your cluster gets updated, but most people treat them like a simple on/off switch.
Let’s see this in action. Imagine you have an AKS cluster named my-aks-cluster in resource group my-resource-group. You want to ensure that any planned maintenance, like Kubernetes version upgrades or node image updates, only happens outside of business hours. You can define this using a maintenance configuration.
First, you need to create a maintenance configuration. This is a child resource of the cluster that tells AKS when it may schedule maintenance.
az aks maintenanceconfiguration add \
  --resource-group my-resource-group \
  --cluster-name my-aks-cluster \
  --name aksManagedAutoUpgradeSchedule \
  --schedule-type Weekly \
  --day-of-week Saturday \
  --interval-weeks 1 \
  --start-date 2024-07-20 \
  --start-time 02:00 \
  --duration 4 \
  --utc-offset +00:00
Here, aksManagedAutoUpgradeSchedule is a reserved configuration name that applies the schedule to automatic Kubernetes version upgrades. The start-time and duration values define a four-hour window from 02:00 to 06:00 UTC, and schedule-type, day-of-week, and interval-weeks make it repeat every Saturday. This means AKS will try to perform maintenance within this Saturday window, but it’s not a hard guarantee for every update.
Because the configuration is created directly on the cluster, there is no separate step to associate it. You can confirm what AKS stored:
az aks maintenanceconfiguration show \
  --resource-group my-resource-group \
  --cluster-name my-aks-cluster \
  --name aksManagedAutoUpgradeSchedule
This command returns the schedule attached to your specific AKS cluster. From this point on, AKS will consider this window when scheduling automatic upgrades.
The core problem AKS planned maintenance windows solve is the disruption caused by automatic, unannounced updates. By defining these windows, you gain predictability, allowing you to schedule updates during off-peak hours to minimize impact on your users and applications. It’s about shifting from reactive patching to proactive, controlled maintenance.
Internally, AKS uses this configuration to queue up maintenance tasks. When an update is available and a maintenance window is approaching, AKS will attempt to apply it. The configuration name is crucial because it determines what the schedule covers: aksManagedAutoUpgradeSchedule governs automatic Kubernetes version upgrades, aksManagedNodeOSUpgradeSchedule governs automatic node OS image upgrades, and default covers the weekly AKS releases. The recurrence pattern is key to making this a sustainable practice.
The levers you control are primarily the timing and frequency of these windows. Each cluster carries its own configurations, so you can define different schedules for different clusters and for different upgrade types on the same cluster. A window is described by a start time and a duration in hours (AKS expects at least four) rather than by an explicit end time, and it can recur daily, weekly, or monthly. Times are read as UTC unless you supply a UTC offset, which lets you anchor the window to a specific local time.
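To make those levers concrete, here is a minimal sketch that prints, rather than runs, one command per schedule so the result can be reviewed before anything touches the cluster. It assumes the az aks maintenanceconfiguration add command and its --duration and --utc-offset flags. The helper name print_window_cmd, the Sunday window, and the -05:00 offset are illustrative assumptions, not values from this article.

```shell
#!/usr/bin/env bash
# Sketch: print (do not execute) `az aks maintenanceconfiguration add` commands.
# Resource names are the article's placeholders; the days, times, and UTC
# offset below are illustrative assumptions.

RESOURCE_GROUP="my-resource-group"
CLUSTER_NAME="my-aks-cluster"

# Hypothetical helper.
# Args: config name, weekday, start time (HH:MM), duration in hours, UTC offset
print_window_cmd() {
  local name="$1" day="$2" start="$3" hours="$4" offset="$5"
  printf 'az aks maintenanceconfiguration add'
  printf ' --resource-group %s --cluster-name %s' "$RESOURCE_GROUP" "$CLUSTER_NAME"
  printf ' --name %s --schedule-type Weekly --interval-weeks 1' "$name"
  printf ' --day-of-week %s --start-time %s --duration %s' "$day" "$start" "$hours"
  # The "=" form keeps a negative offset from being parsed as a separate option.
  printf ' --utc-offset=%s\n' "$offset"
}

# Kubernetes version upgrades: Saturdays, 02:00 UTC, four hours.
print_window_cmd aksManagedAutoUpgradeSchedule Saturday 02:00 4 +00:00
# Node OS image upgrades: Sundays, 01:00 at UTC-05:00, four hours.
print_window_cmd aksManagedNodeOSUpgradeSchedule Sunday 01:00 4 -05:00
```

Printing the commands first is a deliberate choice: you can paste them into a terminal once the schedule looks right, or pipe them into a review step in a deployment pipeline.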
The most surprising thing about AKS planned maintenance windows is that they don’t guarantee maintenance will happen within the window, but rather that maintenance will not happen outside of it, unless it’s an emergency security update. AKS prioritizes critical security patches and may push them outside your defined windows if absolutely necessary. This is a crucial distinction; it’s a "do not disturb" sign, not a "please do this now" command for all types of updates.
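The "do not disturb" reading can be sketched as a simple membership check: given a UTC weekday and hour, is that moment inside the weekly window? The snippet below is a local illustration only, not how AKS evaluates schedules internally. The function name in_window and the WINDOW_* variables are hypothetical, and the defaults mirror the Saturday 02:00 to 06:00 UTC window used earlier.

```shell
#!/usr/bin/env bash
# Sketch: test whether a UTC weekday/hour falls inside a weekly window.
# Illustration only; AKS does not expose or use this function.

WINDOW_DOW=6          # 0=Sunday ... 6=Saturday
WINDOW_START_HOUR=2   # 02:00 UTC
WINDOW_HOURS=4        # window length in hours

# Args: weekday (0-6), hour (0-23), both as plain integers without leading zeros.
# Returns 0 (success) when the moment is inside the window, 1 otherwise.
in_window() {
  local dow="$1" hour="$2"
  local now=$(( dow * 24 + hour ))                       # hours since Sunday 00:00
  local start=$(( WINDOW_DOW * 24 + WINDOW_START_HOUR ))
  # Hours elapsed since the window opened, wrapping around the 168-hour week.
  local elapsed=$(( (now - start + 168) % 168 ))
  [ "$elapsed" -lt "$WINDOW_HOURS" ]
}

# Saturday 03:00 UTC is inside the window; Monday 10:00 UTC is not.
if in_window 6 3; then echo "inside window"; else echo "outside window"; fi
if in_window 1 10; then echo "inside window"; else echo "outside window"; fi
```

A check like this can gate your own deployments so they avoid the hours in which AKS is allowed to upgrade the cluster, with the caveat from above that urgent fixes may still land at other times.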
The next step in mastering AKS updates is understanding how to manage node image upgrades separately from Kubernetes version upgrades.