Caching in Microservices: Patterns for Distributed Architectures (2026)

Caching in microservices is often treated as a simple performance optimization, but its real power lies in its ability to decouple services and manage state in a highly distributed system.

Let’s look at a common scenario: a ProductService that fetches product details from a database, and an OrderService that frequently needs these details to construct an order. Without caching, the OrderService would hammer the ProductService (and by extension, the database) for every order placed.

// ProductService (simplified)
@GetMapping("/products/{id}")
public Product getProduct(@PathVariable String id) {
    // Imagine a complex database lookup here
    return database.findProductById(id);
}

// OrderService (simplified)
@PostMapping("/orders")
public Order createOrder(@RequestBody OrderRequest request) {
    // For each item in the order, we need product details
    List<OrderItem> orderItems = new ArrayList<>();
    for (String productId : request.getProductIds()) {
        Product product = productService.getProduct(productId); // Network call!
        orderItems.add(new OrderItem(product, request.getQuantity(productId)));
    }
    // ... create and save order
    return orderService.save(new Order(orderItems));
}

Here, productService.getProduct(productId) is a synchronous network call. If the ProductService is slow or unavailable, the OrderService grinds to a halt. This is where caching steps in, not just to speed things up, but to make the OrderService resilient to ProductService issues.

The most basic pattern is Local In-Memory Caching. Each instance of the OrderService maintains its own cache.

// OrderService with local cache
@Service
public class OrderService {

    @Autowired
    private ProductServiceClient productService; // RestTemplate or FeignClient

    private final Map<String, Product> productCache = new ConcurrentHashMap<>();

    public Order createOrder(OrderRequest request) {
        List<OrderItem> orderItems = new ArrayList<>();
        for (String productId : request.getProductIds()) {
            Product product = productCache.computeIfAbsent(productId, id -> productService.getProduct(id));
            orderItems.add(new OrderItem(product, request.getQuantity(productId)));
        }
        // ... build the order, then persist it through a repository (injection not shown)
        return orderRepository.save(new Order(orderItems));
    }
}

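This works because computeIfAbsent calls productService.getProduct(id) only when the productId is not already in the productCache, so repeated orders for the same product skip the network call. Two caveats apply. The map never evicts or expires entries, so it grows without bound and never picks up changes. ConcurrentHashMap also runs the mapping function while holding an internal lock, so a slow remote call inside computeIfAbsent can block other threads’ updates to the map.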
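One way to bound how long a local entry can be stale is a per-entry TTL. The following is a minimal sketch, not the article’s implementation: TtlCache and its injectable nano clock are hypothetical names introduced here so that expiry can be exercised without sleeping.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical helper: a local cache whose entries expire after a fixed TTL. */
final class TtlCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtNanos;

        Entry(T value, long expiresAtNanos) {
            this.value = value;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlNanos;
    private final LongSupplier nanoClock; // assumption: injected so tests can control time

    TtlCache(Duration ttl, LongSupplier nanoClock) {
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    V get(K key, Function<K, V> loader) {
        long now = nanoClock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && now - cached.expiresAtNanos < 0) {
            return cached.value; // fresh hit, the loader is not called
        }
        // Miss or expired: load without holding a map lock, then publish the new entry.
        // Concurrent callers may both load the same key; the last write wins.
        V loaded = loader.apply(key);
        entries.put(key, new Entry<>(loaded, now + ttlNanos));
        return loaded;
    }
}
```

In a service, the clock would typically be System::nanoTime and the loader would be the remote call, for example cache.get(productId, productService::getProduct). A caching library such as Caffeine covers the same ground and adds size limits, so a hand-rolled class like this is mainly useful for illustration.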

However, local caches have a major drawback: cache invalidation. If a product’s price changes in the ProductService, every OrderService instance that already cached that product keeps serving the old price from its own memory, and nothing tells those instances to refresh. To address this, we move to Distributed Caching.

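A common choice is a shared cache such as Redis or Memcached that every instance reads from. In a strict Read-Through Cache, the cache layer itself fetches data from the source when it’s not present. The example below does that lookup in application code instead, which is the Cache-Aside pattern covered later.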

// OrderService using a distributed cache (e.g., Spring Data Redis)
@Service
public class OrderService {

    @Autowired
    private ProductServiceClient productService;
    @Autowired
    private RedisTemplate<String, Product> productRedisTemplate;

    public Order createOrder(OrderRequest request) {
        List<OrderItem> orderItems = new ArrayList<>();
        for (String productId : request.getProductIds()) {
            Product product = getProductFromCacheOrSource(productId);
            orderItems.add(new OrderItem(product, request.getQuantity(productId)));
        }
        // ... build the order, then persist it through a repository (injection not shown)
        return orderRepository.save(new Order(orderItems));
    }

    private Product getProductFromCacheOrSource(String productId) {
        Product product = productRedisTemplate.opsForValue().get(productId);
        if (product == null) {
            product = productService.getProduct(productId);
            // Cache it for 1 hour
            productRedisTemplate.opsForValue().set(productId, product, Duration.ofHours(1));
        }
        return product;
    }
}

In this getProductFromCacheOrSource method, we first try to get the Product from Redis. If it’s not there (a cache miss), we fetch it from the ProductService, and then store it in Redis for future requests. The Duration.ofHours(1) is the Time-To-Live (TTL) for the cached item.
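The earlier claim that a cache shields the OrderService from ProductService outages can be checked with a small model. This is a sketch under stated assumptions: CachedLookup is a hypothetical class, a ConcurrentHashMap plays the role of Redis, and a Function plays the role of the remote call. It shows that a hit never touches the source, while a miss still fails if the source is down.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical model of the cache-or-source lookup, with a map standing in for Redis. */
final class CachedLookup<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // stand-in for the shared cache
    private final Function<K, V> source;                        // stand-in for the remote call

    CachedLookup(Function<K, V> source) {
        this.source = source;
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;            // hit: the source is never contacted
        }
        V loaded = source.apply(key); // miss: a source failure propagates to the caller
        cache.put(key, loaded);
        return loaded;
    }
}
```

The protection therefore covers only keys that are already cached and have not expired. A product that was never requested, or whose TTL has run out, still depends on the ProductService being reachable.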

The real challenge with distributed caches is consistency. When a product price changes, how do we update or invalidate the cache? This leads to the Write-Through and Write-Behind patterns.

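In Write-Through, every write goes through the cache layer, which updates the database synchronously before the write is acknowledged. The example below approximates this in application code by saving to the database and then overwriting the cache entry.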

// ProductService with Write-Through Cache
@Service
public class ProductService {

    @Autowired
    private ProductRepository productRepository; // JPA repository
    @Autowired
    private RedisTemplate<String, Product> productRedisTemplate;

    public Product updateProduct(String id, Product updatedProduct) {
        // 1. Update the database
        Product savedProduct = productRepository.save(updatedProduct);

        // 2. Update the cache
        productRedisTemplate.opsForValue().set(id, savedProduct, Duration.ofHours(1)); // Refresh TTL

        return savedProduct;
    }
}

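As long as both steps succeed, any subsequent read from the cache gets the latest data. The two writes are not atomic, though: if the Redis write fails after the database commit, the cache keeps the old value until its TTL expires. Every write also pays the latency of both stores.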
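To make the ordering explicit, here is a minimal in-memory sketch of a write-through wrapper. WriteThroughCache and its storeWriter callback are hypothetical names; the callback stands in for the database write. The cache entry is updated only after the store write returns, so a failed store write leaves the cached value untouched.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

/** Hypothetical write-through wrapper: the cache layer owns the write to the backing store. */
final class WriteThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final BiConsumer<K, V> storeWriter; // stand-in for the database write

    WriteThroughCache(BiConsumer<K, V> storeWriter) {
        this.storeWriter = storeWriter;
    }

    void put(K key, V value) {
        storeWriter.accept(key, value); // synchronous; an exception aborts the whole write
        cache.put(key, value);          // reached only after the store accepted the write
    }

    V get(K key) {
        return cache.get(key);
    }
}
```

This sketch does not handle the opposite failure, where the store write succeeds and the cache update fails. With a remote cache that case needs its own handling, such as deleting the key or relying on the TTL.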

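Write-Behind (also called Write-Back) trades durability for write speed. Writes go to the cache first, and the database is updated asynchronously. This is faster for writes but introduces a window where the database lags behind the cache, and an update can be lost if the cache or queue fails before it is persisted. Write-Around is a different pattern, in which writes go straight to the database and bypass the cache.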

// ProductService with Write-Behind (simplified - often requires message queues)
@Service
public class ProductService {

    @Autowired
    private ProductRepository productRepository;
    @Autowired
    private RedisTemplate<String, Product> productRedisTemplate;
    @Autowired
    private MessageQueueSender messageQueueSender; // e.g., Kafka or RabbitMQ

    public Product updateProduct(String id, Product updatedProduct) {
        // 1. Update the cache immediately
        productRedisTemplate.opsForValue().set(id, updatedProduct, Duration.ofHours(1));

        // 2. Send a message to asynchronously update the database
        messageQueueSender.send("product-update", id, updatedProduct);

        return updatedProduct; // Return immediately
    }

    // Listener for the message queue
    public void handleProductUpdateMessage(String id, Product product) {
        productRepository.save(product); // Update database asynchronously
    }
}

This pattern is complex to implement correctly, especially handling failures in the asynchronous update.
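The shape of the problem is easier to see in a single-process model. The sketch below is an assumption-laden simplification: WriteBehindCache is a hypothetical class, the pending map replaces the message queue, and flush() is called explicitly where a real system would use a background worker or queue consumer. It shows the lag window, the coalescing of repeated writes to one key, and that a failed store write leaves the entry pending for a retry.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

/** Hypothetical write-behind cache: writes are acknowledged before they reach the store. */
final class WriteBehindCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Map<K, V> pending = new LinkedHashMap<>(); // guarded by "this"
    private final BiConsumer<K, V> storeWriter;               // stand-in for the database write

    WriteBehindCache(BiConsumer<K, V> storeWriter) {
        this.storeWriter = storeWriter;
    }

    synchronized void put(K key, V value) {
        cache.put(key, value);   // visible to readers immediately
        pending.put(key, value); // a later write to the same key replaces the earlier one
    }

    V get(K key) {
        return cache.get(key);
    }

    /** Writes pending entries to the store and returns how many were written. */
    synchronized int flush() {
        int written = 0;
        Iterator<Map.Entry<K, V>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, V> entry = it.next();
            storeWriter.accept(entry.getKey(), entry.getValue()); // on failure the entry stays pending
            it.remove();
            written++;
        }
        return written;
    }
}
```

Because pending lives in memory, a crash before flush() loses those writes. That is the durability risk a durable queue such as Kafka or RabbitMQ is meant to reduce.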

A widely used pattern is Cache-Aside, and it is what the getProductFromCacheOrSource method above actually implements. The application code is responsible for checking the cache first and, on a miss, fetching from the data source and populating the cache. The application code also has to invalidate the cache when data changes.

// ProductService with Cache-Aside invalidation
@Service
public class ProductService {

    @Autowired
    private ProductRepository productRepository;
    @Autowired
    private RedisTemplate<String, Product> productRedisTemplate;

    public Product updateProduct(String id, Product updatedProduct) {
        // 1. Update the database
        Product savedProduct = productRepository.save(updatedProduct);

        // 2. Explicitly invalidate the cache entry
        productRedisTemplate.delete(id);

        return savedProduct;
    }

    // Cache-Aside read path (same logic as getProductFromCacheOrSource)
    public Product getProduct(String id) {
        Product product = productRedisTemplate.opsForValue().get(id);
        if (product == null) {
            product = productRepository.findById(id).orElseThrow(...);
            productRedisTemplate.opsForValue().set(id, product, Duration.ofHours(1));
        }
        return product;
    }
}

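When updateProduct is called, it updates the database and then explicitly deletes the corresponding key from Redis. The next time getProduct is called for that id, it will be a cache miss, forcing a fetch from the database and repopulating the cache. A small race remains: a reader that loaded the old row just before the update can write it back to the cache just after the delete, which is one reason to keep the TTL as a backstop.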
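The update-then-invalidate sequence can be traced with an in-memory model. This is a sketch, not the service itself: CacheAsideStore is a hypothetical class in which one map stands in for the repository and another for Redis, and a counter records how often the "database" is read.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory model of the cache-aside read and update paths. */
final class CacheAsideStore<K, V> {
    private final Map<K, V> database = new ConcurrentHashMap<>(); // stand-in for the repository
    private final Map<K, V> cache = new ConcurrentHashMap<>();    // stand-in for Redis
    private final AtomicInteger databaseReads = new AtomicInteger();

    void update(K key, V value) {
        database.put(key, value); // 1. write the source of truth
        cache.remove(key);        // 2. invalidate the cached copy instead of overwriting it
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;                // hit
        }
        databaseReads.incrementAndGet();  // miss: go to the source
        V loaded = database.get(key);
        if (loaded != null) {
            cache.put(key, loaded);       // repopulate for later readers
        }
        return loaded;
    }

    int databaseReads() {
        return databaseReads.get();
    }
}
```

Deleting the key rather than writing the new value keeps the update path simple. The cost is one extra database read on the next request for that product.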

Distributed caching is not just about speed; it is also a mechanism for managing eventual consistency across services. When a service updates its data, other services that depend on it see the new data once their cached copies are invalidated or expire, rather than requiring immediate, synchronous updates. The TTL sets an upper bound on how long a stale copy can be served, and in exchange the system gets higher availability and better performance under load.

The next concept you’ll grapple with is cache distribution strategies like sharding and replication, and how they impact latency and fault tolerance.