A circuit breaker stops a failing service from dragging its callers down with it. Like its electrical namesake, it trips when it detects too many failures and cuts off further calls until the fault has had time to clear.
Let’s watch a circuit breaker in action. Imagine you have a users service that calls a profile service. If profile starts failing, users will keep hammering it, tying up its own threads and connections on calls that cannot succeed until users fails too. This is a cascading failure.
Here’s how to prevent that with a circuit breaker in your application code, using Resilience4j, a popular Java library.
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;

import java.time.Duration;
import java.util.function.Supplier;

public class CircuitBreakerDemo {
    public static void main(String[] args) throws InterruptedException {
        // Define the backend service call
        Supplier<String> profileServiceCall = () -> {
            // Simulate an outage driven by the wall clock:
            // calls fail during the first 5 seconds of every 10-second period
            if (System.currentTimeMillis() % 10000 < 5000) {
                throw new RuntimeException("Profile service unavailable");
            }
            return "Profile data";
        };

        // Configure the circuit breaker
        CircuitBreakerConfig circuitBreakerConfig = CircuitBreakerConfig.custom()
            .failureRateThreshold(50) // Trip when at least 50% of recorded calls fail
            .waitDurationInOpenState(Duration.ofSeconds(5)) // Stay open for 5 seconds
            .permittedNumberOfCallsInHalfOpenState(2) // Allow 2 trial calls in half-open
            .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
            .slidingWindowSize(10) // Look at the last 10 calls
            .recordExceptions(RuntimeException.class) // Exceptions that count as failures
            .build();
        CircuitBreakerRegistry circuitBreakerRegistry = CircuitBreakerRegistry.of(circuitBreakerConfig);
        CircuitBreaker circuitBreaker = circuitBreakerRegistry.circuitBreaker("profileService");

        // Wrap the backend call with the circuit breaker
        Supplier<String> decoratedProfileServiceCall =
            CircuitBreaker.decorateSupplier(circuitBreaker, profileServiceCall);

        // Simulate making calls
        for (int i = 0; i < 20; i++) {
            try {
                String result = decoratedProfileServiceCall.get();
                System.out.println("Call " + i + ": Success - " + result);
            } catch (Exception e) {
                // Covers both backend failures and calls rejected by an open breaker
                System.out.println("Call " + i + ": Failed - " + e.getMessage()
                    + " (State: " + circuitBreaker.getState() + ")");
            }
            Thread.sleep(1000); // Wait 1 second between calls
        }
    }
}
When you run this, you’ll see the profile service call failing intermittently. Initially, the circuit breaker is CLOSED and lets every call through. Once at least 50% of the last 10 calls have failed (the threshold in this config), the breaker trips and goes OPEN.
In the OPEN state, subsequent calls to decoratedProfileServiceCall.get() will immediately throw a CallNotPermittedException without even attempting to call the actual profileService. This is the crucial part: it protects the failing service from being overwhelmed and prevents your calling service from wasting resources.
After waitDurationInOpenState (5 seconds here), the breaker transitions to HALF_OPEN. It allows a small number of trial calls (permittedNumberOfCallsInHalfOpenState = 2) through. If these calls succeed, the breaker closes. If they fail, it goes back to OPEN for another 5 seconds. This trial period lets the breaker recover on its own once the service is healthy again, so a transient failure does not leave it open permanently.
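To make the three states concrete, here is a minimal hand-rolled sketch of the same state machine using only the JDK. It is not how Resilience4j is implemented: TinyBreaker and RejectedException are hypothetical names, the breaker trips after a number of consecutive failures rather than on a failure rate, any failure in HALF_OPEN reopens it, and the clock is injected so the transitions can be driven without real waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical, simplified breaker: trips on consecutive failures, not on a failure rate. */
class TinyBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    /** Thrown instead of invoking the backend while the breaker is OPEN. */
    static class RejectedException extends RuntimeException {
        RejectedException() { super("breaker is OPEN, call not permitted"); }
    }

    private final int failureLimit;   // consecutive failures that trip the breaker
    private final long openMillis;    // how long to stay OPEN before allowing trial calls
    private final int trialCalls;     // successful trial calls needed to close again
    private final LongSupplier clock; // injectable time source, in milliseconds

    private State state = State.CLOSED;
    private int failures = 0;
    private int trialSuccesses = 0;
    private long openedAt = 0;

    TinyBreaker(int failureLimit, long openMillis, int trialCalls, LongSupplier clock) {
        this.failureLimit = failureLimit;
        this.openMillis = openMillis;
        this.trialCalls = trialCalls;
        this.clock = clock;
    }

    synchronized State state() { return state; }

    // synchronized keeps the sketch simple, at the cost of serializing all calls
    synchronized <T> T call(Supplier<T> backend) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt < openMillis) {
                throw new RejectedException(); // fail fast, backend is never touched
            }
            state = State.HALF_OPEN;           // wait is over: let trial calls through
            trialSuccesses = 0;
        }
        try {
            T result = backend.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            throw e;
        }
    }

    private void onSuccess() {
        if (state == State.HALF_OPEN) {
            if (++trialSuccesses >= trialCalls) {
                state = State.CLOSED;          // enough trial calls succeeded
                failures = 0;
            }
        } else {
            failures = 0;                      // a success resets the failure streak
        }
    }

    private void onFailure() {
        // Any failure during HALF_OPEN, or too many in a row while CLOSED, opens the breaker
        if (state == State.HALF_OPEN || ++failures >= failureLimit) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
            failures = 0;
        }
    }
}
```

Constructing it as new TinyBreaker(2, 5000, 2, System::currentTimeMillis) would give a breaker that opens after two consecutive failures, waits 5 seconds, and closes again after two successful trial calls.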
The slidingWindowSize and slidingWindowType determine how the failure rate is calculated. With COUNT_BASED, the rate is computed over the last N calls. With TIME_BASED, it is computed over the calls made in the last N seconds. In both cases N is slidingWindowSize.
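The difference between the two window types is easiest to see in code. The sketch below is a simplified illustration, not Resilience4j’s actual data structures: CountWindow and TimeWindow are hypothetical names, and the time-based variant stores every call individually, which a production implementation would likely avoid by aggregating calls into buckets.

```java
import java.util.ArrayDeque;

/** Hypothetical count-based window: remembers the outcome of the last N calls in a ring buffer. */
class CountWindow {
    private final boolean[] failed; // outcome per slot, true = failure
    private int next = 0;           // slot the next outcome overwrites
    private int recorded = 0;       // outcomes stored so far, capped at the window size
    private int failures = 0;       // failures currently inside the window

    CountWindow(int size) { failed = new boolean[size]; }

    void record(boolean callFailed) {
        if (recorded == failed.length) {
            if (failed[next]) failures--; // the oldest outcome drops out of the window
        } else {
            recorded++;
        }
        failed[next] = callFailed;
        if (callFailed) failures++;
        next = (next + 1) % failed.length;
    }

    /** Failure rate in percent, or -1 until the window has filled once. */
    float failureRate() {
        if (recorded < failed.length) return -1f;
        return 100f * failures / recorded;
    }
}

/** Hypothetical time-based window: considers only calls newer than windowMillis. */
class TimeWindow {
    private final long windowMillis;
    private final ArrayDeque<long[]> calls = new ArrayDeque<>(); // {timestamp, 1 if failed else 0}
    private int failures = 0;

    TimeWindow(long windowMillis) { this.windowMillis = windowMillis; }

    void record(long nowMillis, boolean callFailed) {
        evict(nowMillis);
        calls.addLast(new long[] {nowMillis, callFailed ? 1 : 0});
        if (callFailed) failures++;
    }

    /** Failure rate in percent over the current window, or -1 if it holds no calls. */
    float failureRate(long nowMillis) {
        evict(nowMillis);
        return calls.isEmpty() ? -1f : 100f * failures / calls.size();
    }

    // Drop calls that have aged out of the window
    private void evict(long nowMillis) {
        while (!calls.isEmpty() && nowMillis - calls.peekFirst()[0] >= windowMillis) {
            if (calls.removeFirst()[1] == 1) failures--;
        }
    }
}
```

The practical difference: a count-based window keeps old failures around until enough new calls push them out, while a time-based window forgets them after a fixed interval even if traffic is low.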
The most surprising thing about circuit breakers is that their primary job isn’t to fix the failing service, but to shield the calling service and the overall system from its failure. By immediately rejecting requests, they prevent resource exhaustion (thread pools, connections, memory) in the caller, which would otherwise cascade. This buys time for the failing service to recover or for an operator to intervene.
The next concept you’ll run into is how to implement fallback logic when the circuit breaker is open. Instead of just failing fast, you might want to return cached data, a default value, or a simplified response.
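As a preview, a fallback can be as small as a wrapper that remembers the last good response. The sketch below uses only the JDK and makes a few assumptions: CachedFallback is a hypothetical helper, it treats every RuntimeException as a reason to fall back, and it serves a possibly stale value without any expiry.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical helper: serve the last good value (or a default) when the protected call fails. */
class CachedFallback<T> implements Supplier<T> {
    private final Supplier<T> protectedCall;  // e.g. a supplier wrapped by a circuit breaker
    private final AtomicReference<T> lastGood; // starts as the default, then tracks fresh results

    CachedFallback(Supplier<T> protectedCall, T defaultValue) {
        this.protectedCall = protectedCall;
        this.lastGood = new AtomicReference<>(defaultValue);
    }

    @Override
    public T get() {
        try {
            T fresh = protectedCall.get();
            lastGood.set(fresh); // remember the most recent successful response
            return fresh;
        } catch (RuntimeException e) {
            // Backend failure or a rejected call: answer from the cache instead
            return lastGood.get();
        }
    }
}
```

With the example above, you could wrap the breaker-protected supplier as new CachedFallback<>(decoratedProfileServiceCall, "Default profile"). Callers would then get the default or the last good profile while the breaker is open, instead of an exception.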