LFU Cache: Least Frequently Used Implementation
Key Insights
- LFU cache evicts the least frequently accessed item, making it ideal for workloads with stable popularity patterns like CDN caching and database query results
- Achieving O(1) time complexity for both get and put operations requires a clever combination of hash maps and frequency-bucketed doubly linked lists
- LFU outperforms LRU when access patterns show clear popularity tiers, but suffers from frequency accumulation problems that may require aging mechanisms in long-running systems
Introduction to LFU Cache
Least Frequently Used (LFU) caching takes a fundamentally different approach than its more popular cousin, LRU. While LRU evicts the item that hasn’t been accessed for the longest time, LFU evicts the item that has been accessed the fewest times overall. This distinction matters enormously for certain workloads.
Consider a CDN serving static assets. Some files—your main JavaScript bundle, critical CSS, hero images—get hammered constantly. Others might be accessed once by a crawler and never again. LRU would keep that crawler-accessed file in cache if it happened recently, potentially evicting your heavily-used bundle. LFU keeps the popular content cached regardless of when it was last accessed.
LFU shines in scenarios with stable popularity distributions: database query caching where certain queries dominate, API rate limiting where you need to track frequent offenders, and recommendation systems where popular items should stay readily available.
The challenge with LFU is implementation complexity. A naive approach using a min-heap for frequencies gives you O(log n) operations. We want O(1) for both get() and put(). This requires careful data structure design.
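To see where the O(log n) cost comes from, here is a minimal sketch of that naive heap-based approach (the `NaiveLFU` name and structure are illustrative, not from any library; stale heap entries are skipped lazily at eviction time):

```python
import heapq
import itertools

class NaiveLFU:
    """Naive LFU: a min-heap ordered by (frequency, insertion order).
    Every access pushes a fresh heap entry, so get/put cost O(log n)."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = {}    # key -> value
        self.freq = {}    # key -> current access count
        self.heap = []    # (freq, tick, key) entries, possibly stale
        self.tick = itertools.count()

    def _touch(self, key):
        self.freq[key] += 1
        heapq.heappush(self.heap, (self.freq[key], next(self.tick), key))

    def get(self, key):
        if key not in self.data:
            return -1
        self._touch(key)
        return self.data[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.data:
            self.data[key] = value
            self._touch(key)
            return
        while len(self.data) >= self.capacity:
            f, _, k = heapq.heappop(self.heap)
            # Only act on the entry matching the key's current count;
            # anything else is a stale leftover from an earlier touch.
            if k in self.freq and self.freq[k] == f:
                del self.data[k], self.freq[k]
        self.data[key] = value
        self.freq[key] = 1
        heapq.heappush(self.heap, (1, next(self.tick), key))
```

The lazy-deletion trick avoids an O(n) heap rebuild on every access, but each access still pays a heap push, which is exactly the O(log n) the bucketed design below eliminates.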
Core Data Structures
The O(1) LFU implementation relies on three interconnected data structures working in concert. Understanding their relationships is crucial before diving into code.
First, we need a hash map from keys to nodes for O(1) lookup. Second, we maintain a frequency map where each frequency value points to a doubly linked list of all nodes with that access count. Third, we track the minimum frequency currently in the cache, enabling O(1) eviction.
The frequency map is the clever part. When we need to evict, we look at the minimum frequency bucket and remove from the tail (oldest item at that frequency). When an item’s frequency increases, we move it from one bucket to the next.
```python
class Node:
    def __init__(self, key: int, value: int):
        self.key = key
        self.value = value
        self.freq = 1
        self.prev = None
        self.next = None

class DoublyLinkedList:
    def __init__(self):
        self.head = Node(0, 0)  # Dummy head
        self.tail = Node(0, 0)  # Dummy tail
        self.head.next = self.tail
        self.tail.prev = self.head
        self.size = 0

class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.min_freq = 0
        self.key_to_node = {}   # key -> Node
        self.freq_to_list = {}  # frequency -> DoublyLinkedList
```
The dummy head and tail nodes in our doubly linked list eliminate edge cases when adding or removing nodes. The size attribute lets us check if a frequency bucket is empty in O(1) time.
Get Operation Implementation
The get() operation does more than retrieve a value—it must also increment the access frequency and migrate the node to the appropriate frequency bucket. This frequency update is what makes LFU tracking work.
```python
    def get(self, key: int) -> int:
        if key not in self.key_to_node:
            return -1
        node = self.key_to_node[key]
        self._update_frequency(node)
        return node.value

    def _update_frequency(self, node: Node) -> None:
        old_freq = node.freq
        new_freq = old_freq + 1
        # Remove from current frequency list
        self._remove_node(node)
        # Check if we need to update min_freq
        if old_freq == self.min_freq and self.freq_to_list[old_freq].size == 0:
            self.min_freq = new_freq
        # Add to new frequency list
        node.freq = new_freq
        if new_freq not in self.freq_to_list:
            self.freq_to_list[new_freq] = DoublyLinkedList()
        self._add_to_head(self.freq_to_list[new_freq], node)
```
The critical detail here is updating min_freq. When we remove the last node from the minimum frequency bucket, we know the new minimum is exactly old_freq + 1—because we just moved that node there. This observation keeps the operation O(1) without scanning for a new minimum.
Put Operation Implementation
The put() operation handles three distinct cases: updating an existing key, inserting when at capacity (requiring eviction), and inserting when space is available.
```python
    def put(self, key: int, value: int) -> None:
        if self.capacity <= 0:
            return
        # Case 1: Key exists - update value and frequency
        if key in self.key_to_node:
            node = self.key_to_node[key]
            node.value = value
            self._update_frequency(node)
            return
        # Case 2: At capacity - evict least frequent
        if len(self.key_to_node) >= self.capacity:
            self._evict()
        # Case 3: Insert new node
        new_node = Node(key, value)
        self.key_to_node[key] = new_node
        if 1 not in self.freq_to_list:
            self.freq_to_list[1] = DoublyLinkedList()
        self._add_to_head(self.freq_to_list[1], new_node)
        # New nodes always have frequency 1
        self.min_freq = 1

    def _evict(self) -> None:
        # Get the least frequently used list
        lfu_list = self.freq_to_list[self.min_freq]
        # Remove from tail (oldest among least frequent)
        node_to_remove = lfu_list.tail.prev
        self._remove_node(node_to_remove)
        # Clean up key mapping
        del self.key_to_node[node_to_remove.key]
```
Notice that when inserting a new node, we always set min_freq = 1. A new item has been accessed exactly once, so it’s guaranteed to be among the least frequently used items (or the only one at frequency 1).
The eviction targets the tail of the minimum frequency list. Since we add new nodes to the head, the tail contains the oldest item at that frequency—giving us LRU behavior as a tiebreaker among items with equal frequency.
Doubly Linked List Helper Operations
The doubly linked list operations are the foundation enabling O(1) node manipulation. Unlike singly linked lists where removal requires finding the previous node, doubly linked lists let us remove any node directly using its prev/next pointers.
```python
    def _add_to_head(self, dll: DoublyLinkedList, node: Node) -> None:
        node.next = dll.head.next
        node.prev = dll.head
        dll.head.next.prev = node
        dll.head.next = node
        dll.size += 1

    def _remove_node(self, node: Node) -> None:
        node.prev.next = node.next
        node.next.prev = node.prev
        # Look up the node's bucket by its frequency and decrement size
        freq = node.freq
        self.freq_to_list[freq].size -= 1
```
The dummy head and tail nodes eliminate null checks. When adding to an empty list, dll.head.next is the tail, and the pointer manipulation works correctly. When removing the only real node, we’re left with head pointing to tail—exactly what we want.
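That invariant is easy to verify in isolation. Below is a stripped-down sketch of just the sentinel list (methods moved onto the list class for a standalone demo, so names differ slightly from the cache code above):

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.prev = self.next = None

class DoublyLinkedList:
    def __init__(self):
        self.head, self.tail = Node(0, 0), Node(0, 0)  # dummy sentinels
        self.head.next, self.tail.prev = self.tail, self.head
        self.size = 0

    def add_to_head(self, node):
        node.next, node.prev = self.head.next, self.head
        self.head.next.prev = node
        self.head.next = node
        self.size += 1

    def remove(self, node):
        # No null checks: prev and next always exist thanks to the dummies
        node.prev.next, node.next.prev = node.next, node.prev
        self.size -= 1

dll = DoublyLinkedList()
n = Node(1, 10)
dll.add_to_head(n)  # head <-> n <-> tail
dll.remove(n)       # back to head <-> tail
assert dll.head.next is dll.tail and dll.size == 0
```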
Complete Implementation & Testing
Here’s the complete, working implementation with a test demonstrating the eviction behavior:
```python
class Node:
    def __init__(self, key: int, value: int):
        self.key = key
        self.value = value
        self.freq = 1
        self.prev = None
        self.next = None

class DoublyLinkedList:
    def __init__(self):
        self.head = Node(0, 0)
        self.tail = Node(0, 0)
        self.head.next = self.tail
        self.tail.prev = self.head
        self.size = 0

class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.min_freq = 0
        self.key_to_node = {}
        self.freq_to_list = {}

    def get(self, key: int) -> int:
        if key not in self.key_to_node:
            return -1
        node = self.key_to_node[key]
        self._update_frequency(node)
        return node.value

    def put(self, key: int, value: int) -> None:
        if self.capacity <= 0:
            return
        if key in self.key_to_node:
            node = self.key_to_node[key]
            node.value = value
            self._update_frequency(node)
            return
        if len(self.key_to_node) >= self.capacity:
            self._evict()
        new_node = Node(key, value)
        self.key_to_node[key] = new_node
        if 1 not in self.freq_to_list:
            self.freq_to_list[1] = DoublyLinkedList()
        self._add_to_head(self.freq_to_list[1], new_node)
        self.min_freq = 1

    def _update_frequency(self, node: Node) -> None:
        old_freq = node.freq
        new_freq = old_freq + 1
        self._remove_node(node)
        if old_freq == self.min_freq and self.freq_to_list[old_freq].size == 0:
            self.min_freq = new_freq
        node.freq = new_freq
        if new_freq not in self.freq_to_list:
            self.freq_to_list[new_freq] = DoublyLinkedList()
        self._add_to_head(self.freq_to_list[new_freq], node)

    def _evict(self) -> None:
        lfu_list = self.freq_to_list[self.min_freq]
        node_to_remove = lfu_list.tail.prev
        self._remove_node(node_to_remove)
        del self.key_to_node[node_to_remove.key]

    def _add_to_head(self, dll: DoublyLinkedList, node: Node) -> None:
        node.next = dll.head.next
        node.prev = dll.head
        dll.head.next.prev = node
        dll.head.next = node
        dll.size += 1

    def _remove_node(self, node: Node) -> None:
        node.prev.next = node.next
        node.next.prev = node.prev
        self.freq_to_list[node.freq].size -= 1

# Test the implementation
cache = LFUCache(2)
cache.put(1, 10)      # freq: {1: [1]}
cache.put(2, 20)      # freq: {1: [2, 1]}
print(cache.get(1))   # Returns 10, freq: {1: [2], 2: [1]}
cache.put(3, 30)      # Evicts key 2 (freq=1, oldest), freq: {1: [3], 2: [1]}
print(cache.get(2))   # Returns -1 (evicted)
print(cache.get(3))   # Returns 30, freq: {2: [3, 1]}
cache.put(4, 40)      # Keys 1 and 3 both have freq=2; evicts 1 (oldest)
print(cache.get(1))   # Returns -1 (evicted)
print(cache.get(3))   # Returns 30
print(cache.get(4))   # Returns 40
```
Complexity Analysis & Trade-offs
Time Complexity: Both get() and put() run in O(1) time. Hash map lookups, doubly linked list insertions/removals, and frequency bucket operations are all constant time.
Space Complexity: O(n) where n is the cache capacity. We store each item once in the key map, and the frequency lists collectively contain all n nodes.
When to choose LFU over LRU:
- Workloads with stable popularity patterns (popular items stay popular)
- When recency is less important than overall access patterns
- CDN and static asset caching where hit rates matter more than freshness
When LRU wins:
- Workloads with temporal locality (recently accessed items likely accessed again soon)
- When popularity shifts frequently
- Simpler implementation requirements
LFU Limitations:
The frequency accumulation problem is LFU’s Achilles heel. An item accessed 1000 times early in the cache’s lifetime will stay cached even if it’s never accessed again, while newer popular items get evicted. Solutions include:
- LFU with aging: Periodically halve all frequencies
- Window-based LFU: Only count accesses within a sliding time window
- Adaptive replacement: Dynamically balance between LFU and LRU based on workload
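As one illustration of the aging idea, here is a counter-based sketch in which every count is halved after a fixed number of operations (`AgingLFU` and `decay_every` are hypothetical names; eviction is a deliberate O(n) scan to keep the sketch short, so this shows the aging mechanism, not an optimized cache):

```python
from collections import Counter

class AgingLFU:
    """LFU with aging: all access counts are halved every
    `decay_every` operations, so stale popularity fades over time."""
    def __init__(self, capacity: int, decay_every: int = 1000):
        self.capacity = capacity
        self.decay_every = decay_every
        self.ops = 0
        self.data = {}          # key -> value
        self.freq = Counter()   # key -> decayed access count

    def _tick(self):
        self.ops += 1
        if self.ops % self.decay_every == 0:
            for k in self.freq:
                self.freq[k] //= 2  # halve counts: old hits lose weight

    def get(self, key):
        self._tick()
        if key not in self.data:
            return -1
        self.freq[key] += 1
        return self.data[key]

    def put(self, key, value):
        self._tick()
        if self.capacity <= 0:
            return
        if key not in self.data and len(self.data) >= self.capacity:
            # O(n) scan for the lowest (decayed) count
            victim = min(self.data, key=lambda k: self.freq[k])
            del self.data[victim], self.freq[victim]
        self.data[key] = value
        self.freq[key] += 1
```

With decay enabled, an item that accumulated a huge count early on loses half its advantage at every decay step, so a newly popular item can overtake it instead of being evicted forever.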
For production systems with long-running caches, consider implementing frequency decay or using libraries that handle these edge cases. The pure LFU implementation shown here works well for shorter-lived caches or workloads where popularity truly remains stable.