LFU Cache: Least Frequently Used Implementation

Key Insights

  • LFU cache evicts the least frequently accessed item, making it ideal for workloads with stable popularity patterns like CDN caching and database query results
  • Achieving O(1) time complexity for both get and put operations requires a clever combination of hash maps and frequency-bucketed doubly linked lists
  • LFU outperforms LRU when access patterns show clear popularity tiers, but suffers from frequency accumulation problems that may require aging mechanisms in long-running systems

Introduction to LFU Cache

Least Frequently Used (LFU) caching takes a fundamentally different approach than its more popular cousin, LRU. While LRU evicts the item that hasn’t been accessed for the longest time, LFU evicts the item that has been accessed the fewest times overall. This distinction matters enormously for certain workloads.

Consider a CDN serving static assets. Some files—your main JavaScript bundle, critical CSS, hero images—get hammered constantly. Others might be accessed once by a crawler and never again. LRU would keep that crawler-accessed file in cache if it happened recently, potentially evicting your heavily-used bundle. LFU keeps the popular content cached regardless of when it was last accessed.

LFU shines in scenarios with stable popularity distributions: database query caching where certain queries dominate, API rate limiting where you need to track frequent offenders, and recommendation systems where popular items should stay readily available.

The challenge with LFU is implementation complexity. A naive approach using a min-heap for frequencies gives you O(log n) operations. We want O(1) for both get() and put(). This requires careful data structure design.
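For contrast, here is a minimal sketch of that naive approach (the class and names are illustrative, not part of the implementation below): a hash map for values plus a lazy-deletion min-heap keyed by (frequency, recency). Every access pushes a fresh heap entry, so both operations cost O(log n).

```python
import heapq


class NaiveLFUCache:
    """Illustrative O(log n) LFU using a lazy-deletion min-heap."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = {}   # key -> value
        self.freq = {}   # key -> current access count
        self.heap = []   # (freq, tick, key); stale entries skipped lazily
        self.tick = 0    # monotonic counter: breaks frequency ties by recency

    def get(self, key: int) -> int:
        if key not in self.data:
            return -1
        self._touch(key)
        return self.data[key]

    def put(self, key: int, value: int) -> None:
        if self.capacity <= 0:
            return
        if key not in self.data and len(self.data) >= self.capacity:
            self._evict()
        self.data[key] = value
        self._touch(key)

    def _touch(self, key: int) -> None:
        self.freq[key] = self.freq.get(key, 0) + 1
        self.tick += 1
        # O(log n) push; old entries for this key become stale
        heapq.heappush(self.heap, (self.freq[key], self.tick, key))

    def _evict(self) -> None:
        # Pop until we find an entry whose frequency is still current
        while self.heap:
            f, _, key = heapq.heappop(self.heap)
            if key in self.data and self.freq[key] == f:
                del self.data[key]
                del self.freq[key]
                return
```

This version is easy to write but every access pays the heap's logarithmic cost, and stale entries inflate memory between evictions. The structures below remove both problems.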

Core Data Structures

The O(1) LFU implementation relies on three interconnected data structures working in concert. Understanding their relationships is crucial before diving into code.

First, we need a hash map from keys to nodes for O(1) lookup. Second, we maintain a frequency map where each frequency value points to a doubly linked list of all nodes with that access count. Third, we track the minimum frequency currently in the cache, enabling O(1) eviction.

The frequency map is the clever part. When we need to evict, we look at the minimum frequency bucket and remove from the tail (oldest item at that frequency). When an item’s frequency increases, we move it from one bucket to the next.

class Node:
    def __init__(self, key: int, value: int):
        self.key = key
        self.value = value
        self.freq = 1
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = Node(0, 0)  # Dummy head
        self.tail = Node(0, 0)  # Dummy tail
        self.head.next = self.tail
        self.tail.prev = self.head
        self.size = 0


class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.min_freq = 0
        self.key_to_node = {}  # key -> Node
        self.freq_to_list = {}  # frequency -> DoublyLinkedList

The dummy head and tail nodes in our doubly linked list eliminate edge cases when adding or removing nodes. The size attribute lets us check if a frequency bucket is empty in O(1) time.

Get Operation Implementation

The get() operation does more than retrieve a value—it must also increment the access frequency and migrate the node to the appropriate frequency bucket. This frequency update is what makes LFU tracking work.

def get(self, key: int) -> int:
    if key not in self.key_to_node:
        return -1
    
    node = self.key_to_node[key]
    self._update_frequency(node)
    return node.value


def _update_frequency(self, node: Node) -> None:
    old_freq = node.freq
    new_freq = old_freq + 1
    
    # Remove from current frequency list
    self._remove_node(node)
    
    # Check if we need to update min_freq
    if old_freq == self.min_freq and self.freq_to_list[old_freq].size == 0:
        self.min_freq = new_freq
    
    # Add to new frequency list
    node.freq = new_freq
    if new_freq not in self.freq_to_list:
        self.freq_to_list[new_freq] = DoublyLinkedList()
    self._add_to_head(self.freq_to_list[new_freq], node)

The critical detail here is updating min_freq. When we remove the last node from the minimum frequency bucket, the new minimum is exactly old_freq + 1, because the node we just removed is about to land in that bucket. This observation keeps the operation O(1) without scanning for a new minimum.

Put Operation Implementation

The put() operation handles three distinct cases: updating an existing key, inserting when at capacity (requiring eviction), and inserting when space is available.

def put(self, key: int, value: int) -> None:
    if self.capacity <= 0:
        return
    
    # Case 1: Key exists - update value and frequency
    if key in self.key_to_node:
        node = self.key_to_node[key]
        node.value = value
        self._update_frequency(node)
        return
    
    # Case 2: At capacity - evict least frequent
    if len(self.key_to_node) >= self.capacity:
        self._evict()
    
    # Case 3: Insert new node
    new_node = Node(key, value)
    self.key_to_node[key] = new_node
    
    if 1 not in self.freq_to_list:
        self.freq_to_list[1] = DoublyLinkedList()
    self._add_to_head(self.freq_to_list[1], new_node)
    
    # New nodes always have frequency 1
    self.min_freq = 1


def _evict(self) -> None:
    # Get the least frequently used list
    lfu_list = self.freq_to_list[self.min_freq]
    
    # Remove from tail (oldest among least frequent)
    node_to_remove = lfu_list.tail.prev
    self._remove_node(node_to_remove)
    
    # Clean up key mapping
    del self.key_to_node[node_to_remove.key]

Notice that when inserting a new node, we always set min_freq = 1. A new item has been accessed exactly once, so it’s guaranteed to be among the least frequently used items (or the only one at frequency 1).

The eviction targets the tail of the minimum frequency list. Since we add new nodes to the head, the tail contains the oldest item at that frequency—giving us LRU behavior as a tiebreaker among items with equal frequency.

Doubly Linked List Helper Operations

The doubly linked list operations are the foundation enabling O(1) node manipulation. Unlike singly linked lists where removal requires finding the previous node, doubly linked lists let us remove any node directly using its prev/next pointers.

def _add_to_head(self, dll: DoublyLinkedList, node: Node) -> None:
    node.next = dll.head.next
    node.prev = dll.head
    dll.head.next.prev = node
    dll.head.next = node
    dll.size += 1


def _remove_node(self, node: Node) -> None:
    node.prev.next = node.next
    node.next.prev = node.prev
    
    # Decrement the size of the frequency bucket this node belongs to
    self.freq_to_list[node.freq].size -= 1

The dummy head and tail nodes eliminate null checks. When adding to an empty list, dll.head.next is the tail, and the pointer manipulation works correctly. When removing the only real node, we’re left with head pointing to tail—exactly what we want.
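To see the sentinel pattern in isolation, here is a self-contained copy of the list pieces, with the helper logic written as list methods rather than cache methods for illustration:

```python
class Node:
    def __init__(self, key: int, value: int):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = Node(0, 0)  # Dummy head sentinel
        self.tail = Node(0, 0)  # Dummy tail sentinel
        self.head.next = self.tail
        self.tail.prev = self.head
        self.size = 0

    def add_to_head(self, node: Node) -> None:
        # Works even when the list is empty: head.next is the tail sentinel
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node
        self.size += 1

    def remove(self, node: Node) -> None:
        # No null checks needed: sentinels guarantee prev/next always exist
        node.prev.next = node.next
        node.next.prev = node.prev
        self.size -= 1


dll = DoublyLinkedList()
a, b = Node(1, 10), Node(2, 20)
dll.add_to_head(a)   # head <-> a <-> tail
dll.add_to_head(b)   # head <-> b <-> a <-> tail
dll.remove(a)        # removing the last real node before tail: no special case
assert dll.head.next is b and dll.tail.prev is b and dll.size == 1
dll.remove(b)        # back to empty: head <-> tail
assert dll.head.next is dll.tail and dll.size == 0
```

Every add and remove runs the same four pointer updates regardless of list state, which is exactly why the cache's helpers stay branch-free.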

Complete Implementation & Testing

Here’s the complete, working implementation with a test demonstrating the eviction behavior:

class Node:
    def __init__(self, key: int, value: int):
        self.key = key
        self.value = value
        self.freq = 1
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = Node(0, 0)
        self.tail = Node(0, 0)
        self.head.next = self.tail
        self.tail.prev = self.head
        self.size = 0


class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.min_freq = 0
        self.key_to_node = {}
        self.freq_to_list = {}
    
    def get(self, key: int) -> int:
        if key not in self.key_to_node:
            return -1
        node = self.key_to_node[key]
        self._update_frequency(node)
        return node.value
    
    def put(self, key: int, value: int) -> None:
        if self.capacity <= 0:
            return
        
        if key in self.key_to_node:
            node = self.key_to_node[key]
            node.value = value
            self._update_frequency(node)
            return
        
        if len(self.key_to_node) >= self.capacity:
            self._evict()
        
        new_node = Node(key, value)
        self.key_to_node[key] = new_node
        
        if 1 not in self.freq_to_list:
            self.freq_to_list[1] = DoublyLinkedList()
        self._add_to_head(self.freq_to_list[1], new_node)
        self.min_freq = 1
    
    def _update_frequency(self, node: Node) -> None:
        old_freq = node.freq
        new_freq = old_freq + 1
        self._remove_node(node)
        
        if old_freq == self.min_freq and self.freq_to_list[old_freq].size == 0:
            self.min_freq = new_freq
        
        node.freq = new_freq
        if new_freq not in self.freq_to_list:
            self.freq_to_list[new_freq] = DoublyLinkedList()
        self._add_to_head(self.freq_to_list[new_freq], node)
    
    def _evict(self) -> None:
        lfu_list = self.freq_to_list[self.min_freq]
        node_to_remove = lfu_list.tail.prev
        self._remove_node(node_to_remove)
        del self.key_to_node[node_to_remove.key]
    
    def _add_to_head(self, dll: DoublyLinkedList, node: Node) -> None:
        node.next = dll.head.next
        node.prev = dll.head
        dll.head.next.prev = node
        dll.head.next = node
        dll.size += 1
    
    def _remove_node(self, node: Node) -> None:
        node.prev.next = node.next
        node.next.prev = node.prev
        self.freq_to_list[node.freq].size -= 1


# Test the implementation
cache = LFUCache(2)
cache.put(1, 10)    # freq: {1: [1]}
cache.put(2, 20)    # freq: {1: [2, 1]}
print(cache.get(1)) # Returns 10, freq: {1: [2], 2: [1]}
cache.put(3, 30)    # Evicts key 2 (freq=1, oldest), freq: {1: [3], 2: [1]}
print(cache.get(2)) # Returns -1 (evicted)
print(cache.get(3)) # Returns 30, freq: {2: [3, 1]}
cache.put(4, 40)    # Both keys have freq=2; evicts key 1 (older at that frequency)
print(cache.get(1)) # Returns -1 (evicted)
print(cache.get(3)) # Returns 30
print(cache.get(4)) # Returns 40

Complexity Analysis & Trade-offs

Time Complexity: Both get() and put() run in O(1) time. Hash map lookups are expected constant time, and the doubly linked list insertions/removals and frequency bucket updates are strictly constant time.

Space Complexity: O(n) where n is the cache capacity. We store each item once in the key map, and the frequency lists collectively contain all n nodes. One caveat: this implementation never deletes empty frequency buckets, so freq_to_list itself can grow with the highest frequency reached; deleting a bucket when its list empties keeps the space strictly O(n).

When to choose LFU over LRU:

  • Workloads with stable popularity patterns (popular items stay popular)
  • When recency is less important than overall access patterns
  • CDN and static asset caching where hit rates matter more than freshness

When LRU wins:

  • Workloads with temporal locality (recently accessed items likely accessed again soon)
  • When popularity shifts frequently
  • Simpler implementation requirements

LFU Limitations:

The frequency accumulation problem is LFU’s Achilles heel. An item accessed 1000 times early in the cache’s lifetime will stay cached even if it’s never accessed again, while newer popular items get evicted. Solutions include:

  • LFU with aging: Periodically halve all frequencies
  • Window-based LFU: Only count accesses within a sliding time window
  • Adaptive replacement: Dynamically balance between LFU and LRU based on workload

For production systems with long-running caches, consider implementing frequency decay or using libraries that handle these edge cases. The pure LFU implementation shown here works well for shorter-lived caches or workloads where popularity truly remains stable.
