LFU eviction is fundamentally a lie about what "most used" actually means.
Let’s see how it works with a simple cache. Imagine we have a cache that can hold 3 items, and we’re using LFU.
from collections import Counter
import heapq
class LFUCache:
def __init__(self, capacity: int):
self.capacity = capacity
self.cache = {} # key -> value
self.counts = Counter() # key -> access count
self.freq_map = {} # count -> set of keys with that count
self.min_freq = 0
def _update_freq(self, key):
count = self.counts[key]
self.counts[key] += 1
new_count = self.counts[key]
self.freq_map[count].remove(key)
if not self.freq_map[count]:
del self.freq_map[count]
if count == self.min_freq:
self.min_freq += 1
if new_count not in self.freq_map:
self.freq_map[new_count] = set()
self.freq_map[new_count].add(key)
def get(self, key: int) -> int:
if key not in self.cache:
return -1
self._update_freq(key)
return self.cache[key]
def put(self, key: int, value: int) -> None:
if self.capacity == 0:
return
if key in self.cache:
self.cache[key] = value
self._update_freq(key)
return
if len(self.cache) >= self.capacity:
# Evict LFU item
evict_key = next(iter(self.freq_map[self.min_freq]))
self.freq_map[self.min_freq].remove(evict_key)
if not self.freq_map[self.min_freq]:
del self.freq_map[self.min_freq]
del self.cache[evict_key]
del self.counts[evict_key]
self.cache[key] = value
self.counts[key] = 1
if 1 not in self.freq_map:
self.freq_map[1] = set()
self.freq_map[1].add(key)
self.min_freq = 1
# Example Usage
cache = LFUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1)) # returns 1, count[1]=2, min_freq=1
cache.put(3, 3) # evicts key 2 (LFU, min_freq). count[1]=2, count[3]=1, min_freq=1
print(cache.get(2)) # returns -1 (not found)
print(cache.get(3)) # returns 3, count[1]=2, count[3]=2, min_freq=2
cache.put(4, 4) # evicts key 1 (LFU, min_freq). count[3]=2, count[4]=1, min_freq=1
print(cache.get(1)) # returns -1
print(cache.get(3)) # returns 3, count[3]=3, count[4]=1, min_freq=1
print(cache.get(4)) # returns 4, count[3]=3, count[4]=2, min_freq=2
The core problem LFU tries to solve is that simple LRU (Least Recently Used) can prematurely evict an item that’s consistently used, but not recently used. If an item is accessed 100 times in a row and then not for a while, LRU might ditch it. LFU aims to keep items with high access frequency in the cache.
LFU typically uses a frequency counter for each item. When an item is accessed, its counter increments. When the cache is full and needs to evict, it looks for the item with the lowest frequency count. If there’s a tie in frequency, it usually falls back to LRU among those tied items. This is where the "lie" comes in: it’s not purely LFU; it’s LFU with an LRU tie-breaker.
Internally, this often involves a map where keys are frequencies and values are lists (or sets) of items with that frequency. A separate counter tracks the minimum frequency currently present. When an item’s frequency increases, it moves from its old frequency list to the new one. If the item being evicted is the last item in the minimum frequency list, the min_freq pointer needs to be updated.
The surprising part is how LFU handles ties. Most implementations don’t just pick any item with the minimum frequency; they apply a secondary eviction policy, usually LRU, to break the tie. This means an item that has been accessed many times but not recently might still get evicted if there’s another item with the same low frequency that was accessed even less recently.
Consider a cache with capacity 2.
put(1, 1): Cache:{1:1}, Counts:{1:1}, FreqMap:{1: {1}}, min_freq: 1put(2, 2): Cache:{1:1, 2:2}, Counts:{1:1, 2:1}, FreqMap:{1: {1, 2}}, min_freq: 1get(1): Cache:{1:1, 2:2}, Counts:{1:2, 2:1}, FreqMap:{1: {2}, 2: {1}}, min_freq: 1. Item 1 is now more frequent.put(3, 3): Cache is full. Evict item with min_freq (1). Both 1 and 2 have count 1, but 2 was accessed less recently than 1 (which was just accessed). So, 2 is evicted. Cache:{1:1, 3:3}, Counts:{1:2, 3:1}, FreqMap:{1: {3}, 2: {1}}, min_freq: 1.get(2): Returns -1.get(3): Returns 3. Counts:{1:2, 3:2}, FreqMap:{2: {1, 3}}, min_freq: 2.put(4, 4): Cache is full. Evict item with min_freq (2). Both 1 and 3 have count 2. Item 1 was accessed less recently than item 3. So, 1 is evicted. Cache:{3:3, 4:4}, Counts:{3:2, 4:1}, FreqMap:{1: {4}, 2: {3}}, min_freq: 1.
The next problem you’ll encounter is understanding how different LFU implementations handle the LRU tie-breaking, and the performance implications of that choice.