Rust’s failsafe crate is an elegant way to implement circuit breakers. Its value goes beyond preventing cascading failures: used well, it also lets you actively manage latency budgets across your services.

Let’s see it in action. Imagine a simple HTTP client that might hit a slow downstream service.

use failsafe::{CircuitBreaker, Options};
use reqwest::Client;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure the circuit breaker:
    // - 10 failures allowed before opening
    // - Reset after 30 seconds
    // - Timeout for requests: 5 seconds
    let options = Options::new()
        .max_failures(10)
        .half_open_after(Duration::from_secs(30))
        .timeout(Duration::from_secs(5));

    let breaker = CircuitBreaker::new(options);
    let client = Client::new();

    let url = "http://localhost:8080/slow_service"; // Replace with your slow service URL

    for i in 0..20 {
        let breaker_clone = breaker.clone();
        let client_clone = client.clone();

        tokio::spawn(async move {
            match breaker_clone.execute(move || async move {
                println!("Attempting request {}...", i);
                let response = client_clone.get(url).send().await;
                match response {
                    Ok(res) => {
                        if res.status().is_success() {
                            println!("Request {} successful!", i);
                            Ok(res.text().await.unwrap_or_default())
                        } else {
                            eprintln!("Request {} failed with status: {:?}", i, res.status());
                            Err("HTTP Error")
                        }
                    }
                    Err(e) => {
                        eprintln!("Request {} failed: {}", i, e);
                        Err("Network Error")
                    }
                }
            }).await {
                Ok(body) => println!("Response for request {}: {}", i, body),
                Err(failsafe::Error::Rejected) => println!("Request {} rejected by circuit breaker.", i),
                Err(failsafe::Error::Timeout) => eprintln!("Request {} timed out.", i),
                Err(e) => eprintln!("Request {} encountered an unexpected error: {:?}", i, e),
            }
        });
    }

    // Keep the main thread alive for a bit to see the output
    tokio::time::sleep(Duration::from_secs(60)).await;

    Ok(())
}

This code defines a circuit breaker that opens after 10 consecutive failures and attempts to reset after 30 seconds. The execute method wraps a future. If the future returns an Err or takes longer than the configured timeout, it counts as a failure. Once the breaker opens, subsequent calls to execute immediately return Err(failsafe::Error::Rejected) without running the wrapped future. After the half_open_after duration, the breaker transitions to a "half-open" state and allows a single request. If that request succeeds, the breaker closes; otherwise, it opens again. Note that the loop spawns all 20 requests at once, so against a slow service most of them are admitted before any failure has been recorded; to watch rejections appear, issue the requests sequentially or add a short delay between spawns.

The core problem circuit breakers solve is preventing a single degraded or failing service from bringing down the entire system through a cascade of timeouts and retries. When a service becomes slow, clients attempting to access it will start to time out. If these clients also have retry mechanisms, the failing service can become overwhelmed with requests, making it even slower or completely unresponsive. A circuit breaker interrupts this cycle by quickly failing requests to the unhealthy service, giving it a chance to recover and preventing the overload from spreading.

Internally, failsafe maintains a state machine: Closed, Open, and HalfOpen. In the Closed state, requests are passed through. Each consecutive failure increments a counter. If the counter reaches max_failures, the breaker transitions to Open. In the Open state, all requests are immediately rejected until a timer (half_open_after) expires. Upon expiration, the breaker moves to HalfOpen, allowing a single request. If this request succeeds, the breaker returns to Closed. If it fails, it returns to Open, resetting the timer. Every decision the breaker makes, whether to admit, reject, or probe, follows from which of these three states it is in.
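To make those transitions concrete, here is a minimal std-only sketch of such a state machine. It is not the failsafe crate’s implementation: the Breaker type and its allow, on_success, and on_failure methods are hypothetical names chosen for illustration, and the current time is passed in explicitly so the transitions can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// The three states described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Closed { failures: u32 },
    Open { since: Instant },
    HalfOpen,
}

/// Hypothetical std-only breaker; an illustration, not the crate's internals.
struct Breaker {
    state: State,
    max_failures: u32,
    half_open_after: Duration,
}

impl Breaker {
    fn new(max_failures: u32, half_open_after: Duration) -> Self {
        Breaker {
            state: State::Closed { failures: 0 },
            max_failures,
            half_open_after,
        }
    }

    /// Decide whether a call may proceed at time `now`.
    fn allow(&mut self, now: Instant) -> bool {
        match self.state {
            State::Closed { .. } => true,
            State::Open { since } => {
                if now.saturating_duration_since(since) >= self.half_open_after {
                    // Timer expired: admit exactly one probe request.
                    self.state = State::HalfOpen;
                    true
                } else {
                    false
                }
            }
            // A probe is already in flight; reject everything else.
            State::HalfOpen => false,
        }
    }

    /// Record a success: resets the counter, or closes after a good probe.
    fn on_success(&mut self) {
        if !matches!(self.state, State::Open { .. }) {
            self.state = State::Closed { failures: 0 };
        }
    }

    /// Record a failure observed at time `now`.
    fn on_failure(&mut self, now: Instant) {
        self.state = match self.state {
            State::Closed { failures } if failures + 1 < self.max_failures => {
                State::Closed { failures: failures + 1 }
            }
            // Late failures from calls admitted earlier leave the timer alone.
            State::Open { since } => State::Open { since },
            // Threshold reached, or the half-open probe failed.
            State::Closed { .. } | State::HalfOpen => State::Open { since: now },
        };
    }
}
```

A caller would check allow before running the operation and report the outcome afterwards with on_success or on_failure. Passing the clock in as a parameter is a design choice made for this sketch; it keeps the transition logic deterministic and easy to test.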

The timeout option in failsafe is crucial. It sets how long the circuit breaker waits before counting the wrapped operation as a failure, which is distinct from the underlying HTTP client’s own timeout. The two should be coordinated: whichever is shorter fires first, so a client timeout longer than the breaker’s never triggers, while a shorter one surfaces as an ordinary Err before the breaker’s deadline. If the wrapped future (e.g., an HTTP request) takes longer than the breaker’s timeout, failsafe::Error::Timeout is returned, and it contributes to the failure count. This lets you enforce a strict latency budget at the circuit breaker level as an additional layer of defense.
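The idea of treating "too slow" as a failure can be sketched without any async runtime. The example below is a std-only illustration, not the crate’s mechanism: call_with_budget and CallError are hypothetical names, and the operation runs on a worker thread so the caller can stop waiting once the budget is spent. In async code the analogous tool would be tokio::time::timeout, which drops the future when the deadline passes; a thread, by contrast, keeps running in the background.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical error type: either the operation failed or it ran over budget.
#[derive(Debug, PartialEq)]
enum CallError<E> {
    /// The operation finished in time but reported an error.
    Inner(E),
    /// No result arrived within the budget.
    Timeout,
}

/// Run `op` on a worker thread and stop waiting after `budget`.
fn call_with_budget<T, E, F>(budget: Duration, op: F) -> Result<T, CallError<E>>
where
    T: Send + 'static,
    E: Send + 'static,
    F: FnOnce() -> Result<T, E> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the caller timed out; ignore that.
        let _ = tx.send(op());
    });
    match rx.recv_timeout(budget) {
        Ok(Ok(value)) => Ok(value),
        Ok(Err(e)) => Err(CallError::Inner(e)),
        // Budget exceeded. In this sketch a panic inside `op` (which drops
        // the sender) is reported the same way.
        Err(_) => Err(CallError::Timeout),
    }
}
```

Both Inner and Timeout would count toward the breaker’s failure total; keeping them as separate variants lets logs and metrics distinguish a service that answers with errors from one that does not answer in time.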

What most people miss is that the execute method’s closure can return any Result. The failsafe crate only cares whether the Result is Err or whether the operation times out. This means you can wrap not just network calls, but also expensive computations or any other operation that could reasonably fail or exceed an acceptable latency budget. One caveat: blocking work inside an async closure stalls the executor and cannot be interrupted by a timeout, so hand it to something like tokio::task::spawn_blocking first. The Result type’s Err variant is the universal signal for "something went wrong," and failsafe treats it as such.
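The point that only the Err variant matters, regardless of the success or error types involved, can be shown with a small generic wrapper. This is a simplified std-only sketch under stated assumptions: FailureGuard and GuardError are hypothetical names, and the guard has no half-open recovery, so it stays rejecting once tripped.

```rust
/// Hypothetical error type: the wrapped error, or a rejection by the guard.
#[derive(Debug, PartialEq)]
enum GuardError<E> {
    Inner(E),
    Rejected,
}

/// Counts consecutive `Err` results; knows nothing about what the calls do.
struct FailureGuard {
    consecutive_failures: u32,
    max_failures: u32,
}

impl FailureGuard {
    fn new(max_failures: u32) -> Self {
        FailureGuard { consecutive_failures: 0, max_failures }
    }

    /// Run `op` unless the guard has tripped. `T` and `E` may differ per call.
    fn call<T, E>(&mut self, op: impl FnOnce() -> Result<T, E>) -> Result<T, GuardError<E>> {
        if self.consecutive_failures >= self.max_failures {
            // Tripped: the operation is never executed.
            return Err(GuardError::Rejected);
        }
        match op() {
            Ok(value) => {
                self.consecutive_failures = 0;
                Ok(value)
            }
            Err(e) => {
                self.consecutive_failures += 1;
                Err(GuardError::Inner(e))
            }
        }
    }
}
```

The same guard can wrap a string parse that fails with ParseIntError and a storage call that fails with a String, because it only inspects which variant came back. That is the property the paragraph above relies on.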

The next logical step is to explore how to integrate failsafe with distributed tracing, allowing you to visualize circuit breaker states and their impact on request latency across your microservices.
