A Datadog downtime tells the system to suppress alert notifications for a specific period, so your team isn’t pestered during planned maintenance or other events where alerts are expected but not actionable.
Here’s how to set it up and see it in action.
Let’s say you’ve got a critical database upgrade scheduled for Saturday night, from 10 PM to 2 AM Eastern Time. You know that during this window, your database will be offline, and Datadog will trigger alerts for high query latency, connection errors, and general unavailability. You don’t want these alerts flooding Slack and creating unnecessary noise.
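First, you’d navigate to the "Monitors" section in Datadog and open the downtimes page (labeled "Manage Downtimes" or "Downtimes", depending on your Datadog version), then start scheduling a new downtime.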

Here’s what you’d fill in:
- Scope: This is crucial. You need to tell Datadog which monitors this downtime applies to. You can use tags. For example, if your database servers have the tags `service:postgres` and `env:production`, you’d enter those. Or, if you want to silence all alerts for a specific host, you’d use the `host:` tag. For this example, let’s say your database cluster has the tag `db_cluster:prod-main` and you want to silence alerts for anything tagged with that.
  - Scope: `db_cluster:prod-main`
- Schedule: This defines when the downtime starts and ends.
  - Start Time: 2023-10-28 22:00:00 EDT (Saturday, October 28th, 2023, 10 PM Eastern)
  - End Time: 2023-10-29 02:00:00 EDT (Sunday, October 29th, 2023, 2 AM Eastern)
  - Recurrence: For a one-off maintenance, you’d select "No recurrence." If this was a recurring weekly restart, you might choose "Weekly" and specify the day and time.
- Message (Optional but Recommended): A brief explanation of why the downtime is active. This helps anyone looking at the downtime list understand the context.
  - Message: "Scheduled maintenance for prod-main database cluster upgrade. Expect temporary unavailability."
- Mute Events (Optional): You can also choose to mute specific event streams during this downtime, like deployment events or custom events. For this scenario, we’ll leave this blank.
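If you would rather schedule the downtime programmatically, the same fields map onto a JSON request body. The sketch below only builds and prints such a payload with the standard library; it sends nothing. The field layout is an assumption modeled on Datadog’s v2 downtime API and should be checked against the current API reference. `build_downtime_payload` is a hypothetical helper name, and the fixed UTC-4 offset assumes Eastern Daylight Time, which is in effect on these dates.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: Eastern Daylight Time (UTC-4) applies on 2023-10-28/29.
EDT = timezone(timedelta(hours=-4), name="EDT")


def build_downtime_payload(scope, start, end, message):
    """Build a one-off downtime request body (hypothetical helper).

    The layout is assumed to follow Datadog's v2 downtime API; verify it
    against the official API reference before sending it anywhere.
    """
    if end <= start:
        raise ValueError("downtime must end after it starts")
    return {
        "data": {
            "type": "downtime",
            "attributes": {
                "scope": scope,
                "message": message,
                # Assumed: "*" targets every monitor; the scope narrows what is muted.
                "monitor_identifier": {"monitor_tags": ["*"]},
                "schedule": {
                    "start": start.isoformat(),
                    "end": end.isoformat(),
                },
            },
        }
    }


payload = build_downtime_payload(
    scope="db_cluster:prod-main",
    start=datetime(2023, 10, 28, 22, 0, tzinfo=EDT),
    end=datetime(2023, 10, 29, 2, 0, tzinfo=EDT),
    message="Scheduled maintenance for prod-main database cluster upgrade. "
            "Expect temporary unavailability.",
)
print(json.dumps(payload, indent=2))
```

Using timezone-aware datetimes means the serialized timestamps carry an explicit offset, which avoids ambiguity about which "10 PM" is meant.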
Once you click "Create Downtime," Datadog will suppress notifications from monitors that match your specified scope for the duration of the schedule. You’ll see an entry in the Downtime list, which is marked as active once the window begins.
Let’s imagine you’ve got a monitor that alerts when `avg:system.cpu.user{host:webserver-01} by {host}` stays above 90% for 5 minutes. If webserver-01 is included in your downtime scope, and its CPU spikes to 95% during the maintenance window, this monitor will not send a notification. The monitor can still evaluate and change state, but the notification is suppressed because the downtime is active.
The core problem downtime solves is alert fatigue and noise during planned, non-actionable events. Without it, every system that goes down or misbehaves during maintenance would generate an alert, potentially burying important, real-time issues or just annoying everyone on call.
Internally, when a monitor’s conditions are met, Datadog first checks if any active downtimes apply to that monitor’s scope. If a downtime is found that covers the specific host, tag, or monitor group the alert originates from, and the alert falls within the downtime’s active time window, the alert is suppressed. It’s like a blanket placed over the alert’s path to the notification channels.
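The check described above can be sketched as a small function. This is a simplified model for illustration, not Datadog’s actual implementation: the `Downtime` class, the `is_suppressed` function, and the `host:db-01` tag are all hypothetical, and a scope is treated as a plain set of tags that must all be present on the alerting group.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Downtime:
    scope: frozenset   # tags that must all be present on the alerting group
    start: datetime
    end: datetime


def is_suppressed(alert_tags, fired_at, downtimes):
    """Return True if any downtime covers both the alert's tags and its time."""
    tags = set(alert_tags)
    return any(
        d.scope <= tags and d.start <= fired_at < d.end
        for d in downtimes
    )


maintenance = Downtime(
    scope=frozenset({"db_cluster:prod-main"}),
    start=datetime(2023, 10, 29, 2, 0, tzinfo=timezone.utc),  # 10 PM EDT, Oct 28
    end=datetime(2023, 10, 29, 6, 0, tzinfo=timezone.utc),    # 2 AM EDT, Oct 29
)

during = datetime(2023, 10, 29, 3, 30, tzinfo=timezone.utc)
after = datetime(2023, 10, 29, 7, 0, tzinfo=timezone.utc)
db_alert = {"db_cluster:prod-main", "host:db-01"}

print(is_suppressed(db_alert, during, [maintenance]))               # True
print(is_suppressed(db_alert, after, [maintenance]))                # False
print(is_suppressed({"host:webserver-01"}, during, [maintenance]))  # False
```

Both conditions must hold: an alert from the database cluster after the window ends, or from an unrelated host during the window, still notifies.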
The scope is the most powerful and often misunderstood part. You can scope downtime by:
- Host: `host:my-server-01`
- Tags: `service:redis`, `region:us-east-1` (you can combine these with `&&` for AND logic, e.g., `service:redis && region:us-east-1`)
- Monitor Name/ID: If you want to silence a specific monitor regardless of what it’s monitoring.
- Monitor Group: If a monitor alerts on multiple things (e.g., CPU for all web servers), you can silence alerts for a specific group (e.g., `host:webserver-05`).
The real magic of downtime is its ability to be applied not just to individual hosts or alerts, but to entire logical groups of resources defined by tags. This means you can silence all alerts for a specific application stack (`app:frontend && env:staging`) or a particular data center (`datacenter:rack-3`) with a single downtime entry, significantly reducing the management overhead during large-scale maintenance or deployments. Many teams overlook the power of combining multiple tags with `&&` to create very precise silencing windows that cover complex infrastructure dependencies.
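To see how a combined scope narrows the set of affected resources, here is a toy model of AND matching over tags. It is not Datadog’s scope parser: `parse_scope`, `covered`, and the inventory of resource names are hypothetical, and the `&&` separator simply follows the notation used in this article.

```python
def parse_scope(expression):
    """Split 'a:b && c:d' into the set of tags that must all match."""
    return {part.strip() for part in expression.split("&&") if part.strip()}


def covered(inventory, expression):
    """Return the sorted names of resources carrying every tag in the scope."""
    required = parse_scope(expression)
    return sorted(name for name, tags in inventory.items() if required <= tags)


# Hypothetical inventory: resource name -> tags attached to it.
inventory = {
    "fe-stg-1": {"app:frontend", "env:staging"},
    "fe-stg-2": {"app:frontend", "env:staging", "datacenter:rack-3"},
    "fe-prod-1": {"app:frontend", "env:production"},
    "db-stg-1": {"app:database", "env:staging", "datacenter:rack-3"},
}

print(covered(inventory, "app:frontend && env:staging"))  # ['fe-stg-1', 'fe-stg-2']
print(covered(inventory, "datacenter:rack-3"))            # ['db-stg-1', 'fe-stg-2']
```

Each added tag can only shrink the matched set, which is why a combined scope is a safe way to avoid muting production while working on staging.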
After your maintenance is complete and your systems are back online, you’ll want to ensure your downtime entry is either deleted or has expired. If you forget, you might find yourself missing actual issues later on. The next thing you’ll likely run into is needing to manage recurring downtimes for predictable, regular maintenance, like weekly server reboots or nightly batch job failures.