The most surprising thing about circuit breakers is that they don’t actually prevent failures; they manage the impact of failures.

Let’s say you have a service that calls out to an external API. This API is generally reliable, but sometimes it gets slow or returns errors. If your service keeps hammering a failing API, it can quickly exhaust its own resources (threads, connections, memory) trying to get a response, effectively bringing your service down too.

A circuit breaker sits between your service and the external API. When your service wants to call the API, it first checks with the circuit breaker.

Here’s a simplified Go program using gobreaker:

package main

import (
	"fmt"
	"log"
	"time"

	"github.com/sony/gobreaker"
)

// This is a mock external service that fails in bursts
func callExternalService() error {
	// Simulate an outage during the first 5 seconds of every 20-second window,
	// so that consecutive calls (made 1 second apart below) can fail in a row
	if time.Now().Second()%20 < 5 {
		log.Println("External service is failing!")
		return fmt.Errorf("simulated external service error")
	}
	log.Println("External service succeeded!")
	return nil
}

func main() {
	// Configure the circuit breaker
	st := gobreaker.NewCircuitBreaker(gobreaker.Settings{
		Name: "external-service-breaker",
		// How many requests to allow through in the Half-Open state
		MaxRequests: 3,
		// How long to stay in the Open state before moving to Half-Open
		Timeout: 10 * time.Second,
		// Open the circuit after 3 consecutive failures
		ReadyToTrip: func(counts gobreaker.Counts) bool {
			return counts.ConsecutiveFailures >= 3
		},
		// Function called on every state change, not only when the breaker opens
		OnStateChange: func(name string, from, to gobreaker.State) {
			log.Printf("Circuit breaker '%s' changed state from %s to %s\n", name, from, to)
		},
		// Function to determine whether a returned error counts as a success
		IsSuccessful: func(err error) bool {
			return err == nil
		},
	})

	// Simulate incoming requests to our service
	for i := 0; i < 20; i++ {
		time.Sleep(1 * time.Second) // Simulate some delay between requests

		log.Printf("Attempting request %d\n", i+1)
		// Execute expects a func() (interface{}, error); we have no result to return
		_, err := st.Execute(func() (interface{}, error) {
			// This is the actual call to the external service
			return nil, callExternalService()
		})

		if err != nil {
			// This error could be from the external service OR the circuit breaker itself.
			// The breaker returns ErrOpenState while open, and ErrTooManyRequests when
			// the half-open request quota is already in use.
			log.Printf("Request %d failed: %v\n", i+1, err)
		} else {
			log.Printf("Request %d succeeded.\n", i+1)
		}
	}
}

When you run this, you’ll see callExternalService fail from time to time. Once three consecutive calls fail, the circuit breaker "opens." While open, any subsequent call to st.Execute immediately returns gobreaker.ErrOpenState without even trying to call callExternalService. This protects your service from wasting resources on calls that are likely to fail.

After the Timeout duration (10 seconds in our example), the breaker transitions to HalfOpen. In this state it lets a limited number of trial requests through: up to MaxRequests, which is 3 here. Once that many consecutive requests succeed, the breaker resets to Closed. If any of them fails, it immediately goes back to Open for another Timeout period. This gives the external service a chance to recover without your service constantly bombarding it.
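To make the Closed, Open, HalfOpen cycle concrete, here is a minimal sketch of the state machine using only the standard library. It is not gobreaker’s implementation: the breaker type, its fields, errOpen, and fakeClock are hypothetical names, the code is not safe for concurrent use, and it allows a single trial request in the half-open state. The clock is injected so the timeout can be exercised without sleeping.

```go
package main

import (
	"errors"
	"time"
)

type state int

const (
	closed state = iota
	open
	halfOpen
)

// errOpen is a hypothetical stand-in for gobreaker.ErrOpenState.
var errOpen = errors.New("circuit breaker is open")

// breaker is a toy, single-goroutine circuit breaker.
type breaker struct {
	state       state
	failures    int              // consecutive failures while closed
	maxFailures int              // trip threshold
	timeout     time.Duration    // how long to stay open
	openedAt    time.Time        // when the breaker last opened
	now         func() time.Time // injected clock (time.Now in real use)
}

// execute runs fn unless the breaker is open, then records the outcome.
func (b *breaker) execute(fn func() error) error {
	if b.state == open {
		if b.now().Sub(b.openedAt) < b.timeout {
			return errOpen // fail fast; fn is never called
		}
		b.state = halfOpen // timeout elapsed: allow one trial request
	}

	err := fn()
	switch {
	case err == nil:
		// Any success closes the breaker and clears the failure streak.
		b.state = closed
		b.failures = 0
	case b.state == halfOpen:
		// The trial request failed: reopen for another timeout period.
		b.trip()
	default:
		b.failures++
		if b.failures >= b.maxFailures {
			b.trip()
		}
	}
	return err
}

// trip moves the breaker to open and restarts the timeout.
func (b *breaker) trip() {
	b.state = open
	b.failures = 0
	b.openedAt = b.now()
}

// fakeClock is a manually advanced clock for demos and tests.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }
```

In real use you would set now to time.Now. A production breaker also needs locking and a limit on concurrent half-open requests, which is what a library like gobreaker provides.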

The core problem circuit breakers solve is cascading failures. Imagine your service handles requests with a pool of 100 worker goroutines. If the external API starts taking 500ms to return an error, or hangs until your 1-second timeout fires, every worker that calls it is blocked for up to a second waiting for a response that either never comes or is an error anyway. If your service needs to handle thousands of requests per second, this quickly ties up all 100 workers and makes your service unresponsive. By refusing to call a known-failing service, the circuit breaker keeps your goroutines from getting stuck and lets your service continue handling other requests that don’t depend on the broken dependency.

The gobreaker.Settings struct is where you tune its behavior.

  • MaxRequests: This is the maximum number of requests allowed through in the HalfOpen state. If it is 0, the breaker allows 1. The breaker closes again once that many consecutive requests succeed. It does not control when the circuit opens; that is the job of ReadyToTrip.
  • Timeout: This is the duration the breaker stays Open before transitioning to HalfOpen. This is crucial for allowing downstream systems time to recover.
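  • ReadyToTrip: This function is called after a failure in the Closed state and receives a gobreaker.Counts, which includes Requests, TotalSuccesses, TotalFailures, ConsecutiveSuccesses, and ConsecutiveFailures. If it returns true, the breaker opens. You can use this to open the circuit after, say, 5 consecutive failures, or after 100 requests with more than 10% failure rate. If you leave it unset, the default trips once there are more than 5 consecutive failures.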
  • OnStateChange: This is vital for observability. You must log when the breaker opens or closes. This is your primary alert that a dependency is having trouble.
  • IsSuccessful: This function tells the breaker whether a returned error counts as a success. By default, only a nil error is a success. Because it only sees the error, acting on HTTP status codes (e.g., 5xx are failures, 4xx are client errors and might not trigger the breaker) means your request function has to surface the status as an error value or a specific error type.
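As an illustration of the ReadyToTrip and IsSuccessful ideas in this list, the sketch below shows a failure-rate predicate and an error classifier. To keep it runnable without third-party packages, counts is a local stand-in that mirrors the fields of gobreaker.Counts; with the real library the parameter would be a gobreaker.Counts. The statusError type, the 100-request minimum, and the 10% threshold are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// counts is a local stand-in mirroring the fields of gobreaker.Counts.
type counts struct {
	Requests             uint32
	TotalSuccesses       uint32
	TotalFailures        uint32
	ConsecutiveSuccesses uint32
	ConsecutiveFailures  uint32
}

// readyToTrip opens the circuit once at least 100 requests have been
// counted and more than 10% of them failed.
func readyToTrip(c counts) bool {
	if c.Requests < 100 {
		return false // too little data to judge a failure rate
	}
	failureRate := float64(c.TotalFailures) / float64(c.Requests)
	return failureRate > 0.10
}

// statusError is a hypothetical error type carrying an HTTP status code.
type statusError struct{ Code int }

func (e *statusError) Error() string {
	return fmt.Sprintf("unexpected HTTP status %d", e.Code)
}

// isSuccessful treats 4xx responses as the caller's fault, so they do not
// count against the dependency. Every other non-nil error is a failure.
func isSuccessful(err error) bool {
	if err == nil {
		return true
	}
	var se *statusError
	if errors.As(err, &se) {
		return se.Code >= 400 && se.Code < 500
	}
	return false // network errors, timeouts, 5xx, and so on
}
```

With the real library, functions like these would be assigned to the ReadyToTrip and IsSuccessful fields of gobreaker.Settings. Note that the counts are only cleared on a state change or, while Closed, every Interval; setting the Interval field is what turns a failure rate into a rate over a window.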

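The Execute method on *gobreaker.CircuitBreaker is the primary interface. It takes a function of type func() (interface{}, error) that performs the operation you want to protect. If the circuit is Closed, it calls the function. If Open, it returns gobreaker.ErrOpenState without calling it. If HalfOpen, it lets up to MaxRequests calls through and rejects any beyond that with gobreaker.ErrTooManyRequests.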
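Callers usually want to tell "the breaker rejected the call" apart from "the dependency returned an error", for example to serve a cached value in the first case. Here is a minimal sketch of that branching. errOpenState and errTooManyRequests are local stand-ins for gobreaker.ErrOpenState and gobreaker.ErrTooManyRequests so the snippet runs with the standard library alone; fetchWithFallback and its parameters are hypothetical.

```go
package main

import "errors"

// Local stand-ins for gobreaker.ErrOpenState and gobreaker.ErrTooManyRequests.
var (
	errOpenState       = errors.New("circuit breaker is open")
	errTooManyRequests = errors.New("too many requests")
)

// fetchWithFallback calls execute (for example a closure around the
// breaker's Execute method) and substitutes cached when the breaker
// rejected the call. Errors from the dependency itself pass through.
func fetchWithFallback(execute func() (string, error), cached string) (string, error) {
	value, err := execute()
	switch {
	case err == nil:
		return value, nil
	case errors.Is(err, errOpenState), errors.Is(err, errTooManyRequests):
		// The dependency was never contacted: degrade gracefully.
		return cached, nil
	default:
		// The call went through and failed: let the caller decide.
		return "", err
	}
}
```

Whether a stale value is acceptable depends on the endpoint. For writes or anything that must be fresh, returning the breaker error to the caller is usually the safer choice.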

One common pitfall is not having adequate monitoring on OnStateChange. If your circuit breaker is opening and closing frequently, and you’re not alerted to it, you’re missing a critical signal that a dependency is unstable. The breaker is working, but your operations team might be unaware of the underlying problem. You should have alerts tied to the OnStateChange logs for transitions to Open.
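One way to make that signal alertable is to count transitions into Open and export the counter to your metrics system. The sketch below assumes a callback shaped like gobreaker’s OnStateChange. cbState and its constants are local stand-ins for gobreaker.State, and tripCounter is a hypothetical helper, not part of the library.

```go
package main

import (
	"log"
	"sync/atomic"
)

// cbState is a local stand-in for gobreaker.State.
type cbState int

const (
	stateClosed cbState = iota
	stateHalfOpen
	stateOpen
)

// String makes log lines readable.
func (s cbState) String() string {
	return [...]string{"closed", "half-open", "open"}[s]
}

// tripCounter counts how often a breaker transitions into the open state.
type tripCounter struct {
	trips atomic.Int64
}

// OnStateChange has the same shape as the gobreaker.Settings callback:
// it logs every transition and counts the ones that open the circuit.
func (t *tripCounter) OnStateChange(name string, from, to cbState) {
	log.Printf("circuit breaker %q: %v -> %v", name, from, to)
	if to == stateOpen {
		t.trips.Add(1)
	}
}

// Trips returns the number of transitions into the open state so far.
func (t *tripCounter) Trips() int64 {
	return t.trips.Load()
}
```

An alert on the rate of change of this counter catches a flapping breaker even when each individual outage is short.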

The next logical step after implementing basic circuit breakers is understanding how to combine them with request timeouts and retries to form a robust resilience strategy.
