Azure Monitor Alerts and Action Groups are how you get notified when something goes wrong in Azure and, crucially, how you define what happens next.

Let’s see an alert in action. Imagine a web app whose CPU usage suddenly spikes.

{
  "schemaId": "AzureMonitorCommonAlertRule",
  "properties": {
    "name": "High CPU Usage on WebApp",
    "description": "Alerts when CPU percentage on the web app exceeds 80% for 5 minutes.",
    "severity": "Sev1",
    "enabled": true,
    "evaluationFrequency": "00:05:00",
    "windowSize": "00:05:00",
    "condition": {
      "odata.type": "Microsoft.Azure.Management.Alerts.Models.ThresholdCondition",
      "dataSource": {
        "odata.type": "Microsoft.Azure.Management.Alerts.Models.RuleMetricDataSource",
        "resourceUri": "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/myResourceGroup/providers/Microsoft.Web/sites/myWebApp",
        "metricName": "Percentage CPU"
      },
      "operator": "GreaterThan",
      "threshold": 80,
      "criterionType": "Static"
    },
    "action": {
      "actionGroups": [
        "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/myResourceGroup/providers/microsoft.insights/actionGroups/myActionGroup"
      ]
    }
  }
}

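This simplified JSON describes a rule that monitors the "Percentage CPU" metric for a specific web app. The rule is evaluated every 5 minutes (evaluationFrequency) over a 5-minute window (windowSize), and it fires when CPU usage over that window exceeds 80%. The actionGroups array references a pre-configured ActionGroup resource by its resource ID, and that ActionGroup dictates what happens next. Metric names vary by resource type, so check which metrics your resource actually exposes before reusing the name shown here.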
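
To make the evaluation concrete, here is a minimal Python sketch of a static threshold check over a time window. It illustrates the idea only and is not Azure Monitor's actual implementation: the Sample type, the condition_met function, and the choice to average the samples in the window are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    timestamp: datetime
    value: float  # CPU percentage, 0-100


def condition_met(samples, now, window=timedelta(minutes=5), threshold=80.0):
    """Return True if the average of the samples inside the window exceeds the threshold."""
    # Keep only samples in the half-open interval (now - window, now].
    in_window = [s.value for s in samples if now - window < s.timestamp <= now]
    if not in_window:
        return False  # assumption: no data in the window counts as "not met"
    return sum(in_window) / len(in_window) > threshold


# Made-up readings, one per minute, ending at the evaluation time.
now = datetime(2024, 1, 1, 12, 5)
readings = [(4, 85.0), (3, 90.0), (2, 88.0), (1, 92.0), (0, 95.0)]
samples = [Sample(now - timedelta(minutes=m), v) for m, v in readings]

print(condition_met(samples, now))
```

Calling condition_met once per evaluationFrequency interval mirrors the way the rule is re-checked on a schedule rather than on every data point.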

The core problem Azure Monitor Alerts and Action Groups solve is the gap between detection (something is wrong) and response (do something about it). Without them, you’re either polling metrics manually or relying on users to report issues, and by then it is already too late.

Here’s how it breaks down internally:

  1. Metrics Collection: Azure services emit metrics and logs. Azure Monitor collects them and stores metric data in a time-series database.
  2. Alert Rule Evaluation: Alert rules are essentially recurring queries against this metric data. For our web app example, Azure Monitor re-evaluates the Percentage CPU metric for myWebApp every 5 minutes, as set by evaluationFrequency.
  3. Condition Met: When the Percentage CPU for myWebApp exceeds the 80% threshold over the 5-minute windowSize, the alert rule’s condition is met.
  4. Action Group Triggered: The alert rule then invokes the ActionGroup it references.
  5. Action Group Execution: The ActionGroup is a collection of actions, such as sending an email or an SMS, calling a webhook, running an Azure Function, or creating a ticket in an ITSM tool.

The ActionGroup is the real hero here, turning a notification into a potential automated fix or a clear instruction for a human.

Let’s look at a sample ActionGroup definition:

{
  "schemaId": "Microsoft.Insights/actionGroups",
  "properties": {
    "name": "myActionGroup",
    "enabled": true,
    "groupShortName": "myAG",
    "actions": [
      {
        "actionType": "EmailReceiver",
        "emailReceiver": {
          "name": "ITSupport",
          "emailAddress": "support@example.com",
          "useCommonAlertSchema": true
        }
      },
      {
        "actionType": "SmsReceiver",
        "smsReceiver": {
          "name": "OnCall",
          "countryCode": "1",
          "phoneNumber": "555-123-4567"
        }
      },
      {
        "actionType": "WebhookReceiver",
        "webhookReceiver": {
          "name": "WebhookToAutomation",
          "serviceUri": "https://my-automation-webhook.azurewebsites.net/api/processAlert?code=YOUR_WEBHOOK_KEY",
          "useCommonAlertSchema": true
        }
      }
    ]
  }
}

This ActionGroup defines three distinct actions: send an email to support@example.com, send an SMS to 555-123-4567, and call a webhook at https://my-automation-webhook.azurewebsites.net/api/processAlert. Setting useCommonAlertSchema to true matters because it standardizes the payload across alert types, so a receiver can process every alert with the same parsing logic.

The most useful property of Action Groups is that they decouple the notification mechanism from the alert rule. A single Action Group can notify multiple teams through different channels, and if you need to change the notification list or add a new channel (like a Slack integration via a webhook), you update only the Action Group, not every alert rule that uses it. This is a massive win for operational hygiene.
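
The indirection can be sketched in a few lines of Python. This is a toy model, not the Azure SDK: the ActionGroup and AlertRule classes, the in-memory groups registry, the receiver strings, and the second rule name are all hypothetical and exist only to show that rules hold a reference to the group rather than a copy of its receivers.

```python
from dataclasses import dataclass, field


@dataclass
class ActionGroup:
    name: str
    receivers: list = field(default_factory=list)  # e.g. "email:support@example.com"


@dataclass
class AlertRule:
    name: str
    action_group_id: str  # a reference to the group, not a copy of its receivers


# Hypothetical in-memory registry standing in for Azure resource IDs.
groups = {"myActionGroup": ActionGroup("myActionGroup", ["email:support@example.com"])}
rules = [
    AlertRule("High CPU Usage on WebApp", "myActionGroup"),
    AlertRule("High Memory Usage on WebApp", "myActionGroup"),  # made-up second rule
]


def receivers_for(rule):
    """Resolve the rule's action group at fire time and return its receivers."""
    return list(groups[rule.action_group_id].receivers)


# Add a new channel once, on the group; no rule is modified.
groups["myActionGroup"].receivers.append("webhook:slack")

for rule in rules:
    print(rule.name, "->", receivers_for(rule))
```

Because each rule resolves its group when it fires, both rules pick up the new webhook receiver without being edited.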

Consider the WebhookReceiver action. When the alert fires, Azure Monitor sends an HTTP POST with a JSON payload to the serviceUri. This payload carries the details of the alert: the rule name, severity, affected resource, time of firing, and the metric values that triggered it. That is enough to drive automated remediation. For example, your webhook endpoint could parse the payload, see that it’s a "High CPU" alert on a web app, and trigger an Azure Automation runbook to restart the web app or scale out the App Service plan.
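
As a rough sketch of the receiving side, the Python below parses a trimmed common alert schema payload and picks a response. The field names under data.essentials (alertRule, severity, monitorCondition, alertTargetIDs, firedDateTime) follow the common alert schema as I understand it, so verify them against the current schema reference. The sample values, the summarize_alert and choose_action functions, and the action names are assumptions for illustration, and no real remediation call is made.

```python
import json

# Trimmed, made-up sample of what a webhook receiver might be sent.
SAMPLE_BODY = json.dumps({
    "schemaId": "azureMonitorCommonAlertSchema",
    "data": {
        "essentials": {
            "alertRule": "High CPU Usage on WebApp",
            "severity": "Sev1",
            "monitorCondition": "Fired",
            "alertTargetIDs": [
                "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/myResourceGroup"
                "/providers/Microsoft.Web/sites/myWebApp"
            ],
            "firedDateTime": "2024-01-01T12:05:00Z",
        },
        "alertContext": {},
    },
})


def summarize_alert(body: str) -> dict:
    """Extract the fields a remediation handler typically needs."""
    essentials = json.loads(body)["data"]["essentials"]
    return {
        "rule": essentials["alertRule"],
        "severity": essentials["severity"],
        "condition": essentials["monitorCondition"],
        "targets": essentials.get("alertTargetIDs", []),
        "fired_at": essentials.get("firedDateTime"),
    }


def choose_action(summary: dict) -> str:
    """Hypothetical policy: act only on firing alerts, never on resolutions."""
    if summary["condition"] != "Fired":
        return "ignore"
    if "cpu" in summary["rule"].lower():
        return "scale_out"
    return "notify_only"


summary = summarize_alert(SAMPLE_BODY)
print(summary["rule"], "->", choose_action(summary))
```

In a real endpoint, the string returned by choose_action would be mapped to an actual operation, such as starting a runbook, and the handler would also need to authenticate the caller.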

The next thing you’ll want to explore is how to use the Azure Function action type within an Action Group, which provides even more programmatic control over your responses.
