Time-based circuit breaker windows measure failure rates by dividing a recent, fixed-duration time window into smaller, contiguous sub-windows, and then computing the proportion of failed requests among all requests in the most recent sub-window.
Let’s watch this in action. Imagine a service, frontend, that calls another service, user-profile, with the following circuit breaker configuration:
    {
      "frontend": {
        "circuit_breakers": [
          {
            "name": "user-profile",
            "type": "time_window",
            "interval": "10s",
            "window": "60s",
            "threshold": 0.5,
            "interval_count": 10
          }
        ]
      }
    }
Here, the window is 60 seconds. This 60-second window is divided into interval_count (10) smaller intervals, each lasting 6 seconds (60s / 10 = 6s).
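The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the helper names parse_seconds and sub_window_seconds are hypothetical, and the parser assumes durations are always written with an "s" suffix as in the example config.

```python
import json

# The example configuration from this article, embedded for a self-contained sketch.
CONFIG = json.loads("""
{
  "frontend": {
    "circuit_breakers": [
      {"name": "user-profile", "type": "time_window", "interval": "10s",
       "window": "60s", "threshold": 0.5, "interval_count": 10}
    ]
  }
}
""")


def parse_seconds(text: str) -> float:
    """Parse a duration such as "60s" into seconds (assumption: only "s" is supported)."""
    if not text.endswith("s"):
        raise ValueError(f"unsupported duration: {text!r}")
    return float(text[:-1])


def sub_window_seconds(breaker: dict) -> float:
    """Derive the sub-window duration as window / interval_count."""
    return parse_seconds(breaker["window"]) / breaker["interval_count"]


breaker = CONFIG["frontend"]["circuit_breakers"][0]
print(sub_window_seconds(breaker))  # 6.0
```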
When frontend calls user-profile, each request is timestamped and assigned to one of these 6-second intervals within the overall 60-second window. For retention, the breaker only cares that a request happened within the last 60 seconds; which 6-second interval it fell into matters when the failure rate is computed.
Let’s say the frontend makes 100 requests to user-profile over a 60-second period. The circuit breaker is interested in the most recent 6-second interval. If 10 of those 100 requests fall within that most recent interval and 7 of those 10 failed, the failure rate for that interval is 7/10 = 0.7.
The threshold is set to 0.5 (50%). Since 0.7 is greater than 0.5, the circuit breaker trips, and subsequent calls to user-profile from frontend will fail immediately without even attempting to reach user-profile.
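A minimal sketch of that trip decision, assuming the breaker opens only when the rate is strictly greater than the threshold (the text says "greater than", so behavior at exact equality is an assumption). The names SubWindowBreaker and CircuitOpenError are hypothetical, and recovery (half-open state) is deliberately left out.

```python
class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class SubWindowBreaker:
    """Hypothetical breaker that trips on the failure rate of one sub-window."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.is_open = False

    def observe_sub_window(self, failed: int, total: int) -> float:
        """Compute the sub-window failure rate and open the breaker if it is too high."""
        rate = failed / total if total else 0.0  # no traffic counts as no failures
        if rate > self.threshold:  # assumption: strictly greater than trips
            self.is_open = True
        return rate

    def call(self, func):
        """Fail immediately when open; otherwise invoke the downstream call."""
        if self.is_open:
            raise CircuitOpenError("user-profile breaker is open")
        return func()


breaker = SubWindowBreaker(threshold=0.5)
rate = breaker.observe_sub_window(failed=7, total=10)
print(rate, breaker.is_open)  # 0.7 True
```

Once is_open is set, call raises CircuitOpenError without invoking func, which mirrors the fail-fast behavior described above.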
The beauty here is that the breaker is always looking at a fresh, fixed-size slice of recent history. As new requests come in, older ones simply fall out of the 60-second window. It’s a continuous, rolling calculation.
The interval_count of 10 means the 60-second window is broken into 10 discrete buckets, each 6 seconds long. If a request arrives at second 55 of the 60-second window, it lands in the bucket for seconds 54-60. If another request arrives at second 61, it lands in a new bucket, and the bucket for seconds 0-6 is now considered "old" and is no longer part of the calculation for the current failure rate.
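The bucket bookkeeping described above might look roughly like the following. It is a sketch, not a reference implementation: RollingBuckets is a hypothetical name, timestamps are plain seconds, and requests are assumed to arrive in non-decreasing time order.

```python
from collections import deque


class RollingBuckets:
    """Hypothetical rolling window of fixed-size time buckets."""

    def __init__(self, window_s: float = 60.0, interval_count: int = 10) -> None:
        self.bucket_s = window_s / interval_count  # 6 seconds in the example
        self.interval_count = interval_count
        self.buckets = deque()  # each entry: [bucket_index, successes, failures]

    def record(self, timestamp_s: float, ok: bool) -> None:
        """Assign a request to its bucket and evict buckets outside the window."""
        index = int(timestamp_s // self.bucket_s)
        if not self.buckets or self.buckets[-1][0] != index:
            self.buckets.append([index, 0, 0])
        self.buckets[-1][1 if ok else 2] += 1
        # Keep only the most recent interval_count bucket indices.
        while self.buckets[0][0] <= index - self.interval_count:
            self.buckets.popleft()

    def bucket_ranges(self) -> list:
        """Return the [start, end) second ranges of the retained buckets."""
        return [(b[0] * self.bucket_s, (b[0] + 1) * self.bucket_s) for b in self.buckets]

    def latest_failure_rate(self) -> float:
        """Failure rate of the most recent bucket only."""
        _, successes, failures = self.buckets[-1]
        return failures / (successes + failures)


rolling = RollingBuckets()
rolling.record(1.0, ok=True)    # lands in the bucket for seconds 0-6
rolling.record(55.0, ok=True)   # lands in the bucket for seconds 54-60
rolling.record(61.0, ok=False)  # opens the bucket for seconds 60-66, evicting 0-6
print(rolling.bucket_ranges())  # [(54.0, 60.0), (60.0, 66.0)]
```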
This "time window" approach is distinct from "counter-based" circuit breakers. Counter-based breakers look at the last N requests regardless of how long ago they happened. Time-based breakers are more sensitive to recent performance degradation: if a service suddenly starts failing, a time-based breaker sees the spike within its current sub-window, whereas a counter-based breaker’s last N requests may still be dominated by earlier successes, which dilutes the failure rate.
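To make the contrast concrete, here is a small simulation under assumed traffic: 90 successes spread over the first 54 seconds, then 10 failures in the final 6 seconds. The function names and the choice of N = 100 are illustrative assumptions, and timestamps are integer milliseconds to keep the bucket arithmetic exact.

```python
from collections import deque

SUB_WINDOW_MS = 6_000  # 60s window / 10 sub-windows, as in the example
LAST_N = 100           # assumed size of the counter-based window

# (timestamp_ms, ok): 90 successes, then a burst of 10 failures.
events = [(i * 600, True) for i in range(90)]
events += [(54_000 + i * 600, False) for i in range(10)]


def counter_based_rate(events, last_n):
    """Failure rate over the last N requests, however old they are."""
    recent = deque(events, maxlen=last_n)  # deque keeps only the final N items
    failures = sum(1 for _, ok in recent if not ok)
    return failures / len(recent)


def time_based_rate(events, sub_window_ms):
    """Failure rate over the sub-window that contains the latest request."""
    latest_bucket = events[-1][0] // sub_window_ms
    recent = [ok for ts, ok in events if ts // sub_window_ms == latest_bucket]
    return recent.count(False) / len(recent)


print(counter_based_rate(events, LAST_N))      # 0.1
print(time_based_rate(events, SUB_WINDOW_MS))  # 1.0
```

With a threshold of 0.5, only the time-based calculation would trip on this burst; the counter-based rate stays low because the 90 earlier successes are still inside its last-N window.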
The interval field (10s in the example) is a bit of a misnomer in this context. It is not the duration of the sub-window; that is derived from window and interval_count, which in our example gives 60s / 10 = 6s. Instead, interval often signifies how frequently the breaker re-evaluates or reports its state. Some implementations also treat it as a lower bound on the sub-window size, and in implementations that size sub-windows from interval directly, an interval of 5s with a 60s window would yield 12 sub-windows. Note that under the lower-bound reading, the example’s 6s sub-windows would be widened to 10s, so check how your implementation interprets the field; this walkthrough uses the derived 6s.
The threshold is the critical lever. A lower threshold means the circuit breaker is more sensitive and will trip faster with fewer failures. A higher threshold means it will tolerate more failures before opening. The choice of window and interval_count determines the granularity and recency of the data used for the failure rate calculation. A shorter window makes the breaker more reactive to sudden bursts of errors, while a longer window smooths out transient glitches.
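The effect of these levers can be illustrated with a short experiment. The traffic pattern here is an assumption made up for the sketch: one request per second for 60 seconds, with the last four requests failing. The function latest_sub_window_rate is a hypothetical helper, not part of any library.

```python
def latest_sub_window_rate(events, window_s, interval_count):
    """Failure rate within the sub-window that contains the newest request."""
    bucket_s = window_s / interval_count
    latest = max(int(t // bucket_s) for t, _ in events)
    in_bucket = [ok for t, ok in events if int(t // bucket_s) == latest]
    failures = sum(1 for ok in in_bucket if not ok)
    return failures / len(in_bucket)


# One request per second; requests at seconds 56-59 fail.
events = [(t, t < 56) for t in range(60)]

for window_s in (60, 300):
    rate = latest_sub_window_rate(events, window_s, interval_count=10)
    for threshold in (0.1, 0.5):
        # Prints the window, the threshold, and whether the breaker would trip.
        print(window_s, threshold, rate > threshold)
```

With a 60s window (6s sub-windows) the burst accounts for 4 of 6 requests and trips at both thresholds. With a 300s window (30s sub-windows) it is diluted to 4 of 30, which trips at a threshold of 0.1 but not at 0.5.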
What most people miss is that the interval value in the configuration doesn’t directly define the duration of the sub-windows used for calculating the failure rate; that duration is window / interval_count. The interval field’s role is usually the reporting frequency of metrics or, in some implementations, a floor on sub-interval length, so that a very small window / interval_count still yields sub-intervals of at least interval. The failure rate itself is still computed from the failures in the most recent sub-window.
The next concept to explore is how these circuit breakers interact with retries.