Robin Hood Hashing: Variance-Reducing Hash Table


Key Insights

  • Robin Hood hashing redistributes probe sequence lengths during insertion by swapping elements, dramatically reducing worst-case lookup times while maintaining the cache-friendly properties of linear probing.
  • Storing the probe sequence length (PSL) alongside each entry enables early termination during lookups—if you’ve probed further than any element could possibly be, the key doesn’t exist.
  • Backward shift deletion becomes practical with Robin Hood hashing because the PSL invariant makes it straightforward to identify which elements need repositioning.

Introduction to Open Addressing Pain Points

Linear probing is the simplest open addressing strategy: when a collision occurs, walk forward through the table until you find an empty slot. It’s cache-friendly, easy to implement, and works well at low load factors. But it has a nasty problem: variance.

Consider a hash table at 70% capacity. Most insertions find a slot within 2-3 probes. But some keys—the unlucky ones that hash into the middle of a cluster—might probe 15, 20, or even 50 slots before finding an opening. This creates unpredictable performance. Your average case looks great in benchmarks, but production traffic occasionally hits those pathological keys and latency spikes.

The core issue is that linear probing creates clusters, and clusters grow faster than they shrink. A key that hashes into an existing cluster extends it, making future collisions even more likely to land there. The rich get richer: long probe sequences attract more elements, becoming longer still.
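You can watch this variance emerge with a quick simulation. The sketch below is a toy linear-probing insert (using the key itself as a stand-in hash, an assumption for illustration) that records how far each key travels; at ~70% load, the mean probe count stays small while the maximum is far larger.

```python
import random

def linear_probe_psls(n_keys: int, capacity: int, seed: int = 0) -> list[int]:
    """Insert random keys with plain linear probing; return each key's probe count."""
    rng = random.Random(seed)
    table = [None] * capacity
    psls = []
    for _ in range(n_keys):
        key = rng.randrange(10**9)
        idx = key % capacity  # stand-in hash: the key itself
        psl = 0
        while table[idx] is not None:
            idx = (idx + 1) % capacity
            psl += 1
        table[idx] = key
        psls.append(psl)
    return psls

psls = linear_probe_psls(n_keys=716, capacity=1024)  # ~70% load factor
print(max(psls), sum(psls) / len(psls))  # max is many times the mean
```

Running this a few times with different seeds shows the pattern: the unlucky tail is consistently much longer than the typical case.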

Robin Hood hashing attacks this variance directly. Instead of accepting that some keys suffer while others thrive, it redistributes the pain. The result is a hash table where the difference between best-case and worst-case probe lengths shrinks dramatically.

The Robin Hood Principle

The insight is elegant: during insertion, track how far each element has traveled from its ideal position (its probe sequence length, or PSL). When you’re probing for an empty slot and encounter an element with a shorter PSL than your current displacement, swap them. You take the “rich” element’s spot and force it to continue searching.

This is wealth redistribution for hash tables. Elements that got lucky (low PSL) give up their positions to elements that got unlucky (high PSL). Over many insertions, this evens out the probe length distribution.

Here’s the core insertion logic:

def robin_hood_insert(table, key, value, hash_fn):
    idx = hash_fn(key) % len(table)
    psl = 0  # probe sequence length for the element we're inserting
    
    entry = Entry(key, value, psl)
    
    while True:
        if table[idx] is None:
            # Found empty slot, insert here
            table[idx] = entry
            return
        
        if table[idx].key == key:
            # Key exists, update value
            table[idx].value = entry.value
            return
        
        # Robin Hood swap: if current element is "richer" (lower PSL), take its spot
        if table[idx].psl < entry.psl:
            entry, table[idx] = table[idx], entry
        
        # Continue probing
        idx = (idx + 1) % len(table)
        entry.psl += 1

The swap is the key operation. After a swap, we continue probing with the displaced element, whose PSL keeps growing from its old value as it searches for a new slot. Displacements can cascade through the table, but each swap moves the distribution closer to equilibrium.
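To see a swap happen concretely, here is a toy trace. The `home` dict stands in for the hash function (hand-picked slots, an assumption chosen to force exactly one swap), and the insert sketch omits the duplicate-key check for brevity:

```python
from dataclasses import dataclass

@dataclass
class Entry:
    key: str
    psl: int

def rh_insert(table, key, home_slot):
    """Robin Hood insert without the duplicate-key check (toy sketch)."""
    idx, entry = home_slot, Entry(key, 0)
    while table[idx] is not None:
        if table[idx].psl < entry.psl:  # occupant is "richer": swap
            entry, table[idx] = table[idx], entry
        idx = (idx + 1) % len(table)
        entry.psl += 1
    table[idx] = entry

table = [None] * 8
home = {'A': 0, 'B': 0, 'C': 2, 'D': 0}  # hand-picked home slots
for k in 'ABCD':
    rh_insert(table, k, home[k])

# A lands at slot 0 (psl 0), B at 1 (psl 1), C at 2 (psl 0).
# D probes slots 0 and 1, reaching C at slot 2 with psl 2 vs C's psl 0: D evicts C.
# C resumes probing and settles at slot 3 with psl 1.
print([(e.key, e.psl) for e in table if e])  # → [('A', 0), ('B', 1), ('D', 2), ('C', 1)]
```

Without the swap, D would have walked to slot 3 with PSL 3; Robin Hood caps the worst case at 2 and gives C a PSL of 1 instead.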

Implementation Deep Dive

A complete implementation needs to track PSL for every entry. Here’s a full Robin Hood hash table in Python:

from dataclasses import dataclass
from typing import Optional, TypeVar, Generic

K = TypeVar('K')
V = TypeVar('V')

@dataclass
class Entry(Generic[K, V]):
    key: K
    value: V
    psl: int  # probe sequence length

class RobinHoodHashTable(Generic[K, V]):
    def __init__(self, capacity: int = 16, max_load: float = 0.7):
        self.capacity = capacity
        self.max_load = max_load
        self.size = 0
        self.table: list[Optional[Entry[K, V]]] = [None] * capacity
    
    def _hash(self, key: K) -> int:
        return hash(key) % self.capacity
    
    def insert(self, key: K, value: V) -> None:
        if self.size >= self.capacity * self.max_load:
            self._resize()
        
        idx = self._hash(key)
        entry = Entry(key, value, 0)
        
        while True:
            if self.table[idx] is None:
                self.table[idx] = entry
                self.size += 1
                return
            
            if self.table[idx].key == key:
                self.table[idx].value = value
                return
            
            # Robin Hood: steal from the rich
            if self.table[idx].psl < entry.psl:
                entry, self.table[idx] = self.table[idx], entry
            
            idx = (idx + 1) % self.capacity
            entry.psl += 1
    
    def get(self, key: K) -> Optional[V]:
        idx = self._hash(key)
        psl = 0
        
        while self.table[idx] is not None:
            if self.table[idx].key == key:
                return self.table[idx].value
            
            # Early termination: if we've probed further than this element's PSL,
            # the key can't exist (it would have been swapped earlier)
            if psl > self.table[idx].psl:
                return None
            
            idx = (idx + 1) % self.capacity
            psl += 1
        
        return None
    
    def _resize(self) -> None:
        old_table = self.table
        self.capacity *= 2
        self.table = [None] * self.capacity
        self.size = 0
        
        for entry in old_table:
            if entry is not None:
                self.insert(entry.key, entry.value)

The get method demonstrates a crucial optimization: early termination using PSL. If we’ve probed n slots and encounter an element with PSL less than n, we know our target key doesn’t exist. Why? Because if it did exist at this position or beyond, it would have displaced this “richer” element during insertion.

This bounds our worst-case lookup to the maximum PSL in the table, which Robin Hood keeps small.
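A small sketch makes the bound concrete. Here the table is a hand-built list of `(key, psl)` pairs in a valid Robin Hood arrangement (an illustrative assumption, not output from the class above), and the lookup counts its probes:

```python
def rh_lookup(table, key, home_slot):
    """Return (found, probes) for a toy table of (key, psl) pairs."""
    idx, psl, probes = home_slot, 0, 0
    while table[idx] is not None:
        probes += 1
        k, entry_psl = table[idx]
        if k == key:
            return True, probes
        if psl > entry_psl:  # probed past where the key could possibly live
            return False, probes
        idx = (idx + 1) % len(table)
        psl += 1
    return False, probes

# A valid Robin Hood state: A and B hash to slot 0; C and D to 2; E to 4.
table = [('A', 0), ('B', 1), ('C', 0), ('D', 1), ('E', 0), None, None, None]
print(rh_lookup(table, 'Z', 0))  # → (False, 3): stops at C, not the empty slot
print(rh_lookup(table, 'D', 2))  # → (True, 2)
```

Without the PSL check, the missing-key lookup would walk all the way to the empty slot at index 5, six probes instead of three; the gap widens as clusters grow.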

Deletion Strategies: Backward Shift vs Tombstones

Deletion in open addressing is tricky. The naive approach—just mark the slot as empty—breaks lookups. Elements that probed past this slot during insertion now become unreachable.

Tombstones solve this: mark deleted slots specially, treat them as occupied during lookups but available during insertions. Simple, but tombstones accumulate and degrade performance over time.
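For contrast, a tombstone scheme might look like this minimal sketch (the `TOMBSTONE` sentinel and the bare-key table are illustrative assumptions, not part of the Robin Hood implementation):

```python
TOMBSTONE = object()  # sentinel marking a deleted slot

def ts_delete(table, idx):
    table[idx] = TOMBSTONE  # O(1), but the slot stays "dirty"

def ts_lookup(table, key, home_slot):
    """Probe until an empty slot; tombstones do NOT stop the probe."""
    idx = home_slot
    while table[idx] is not None:
        if table[idx] is not TOMBSTONE and table[idx] == key:
            return idx
        idx = (idx + 1) % len(table)
    return None

table = ['A', 'B', 'C', None, None, None, None, None]  # all hash to slot 0
ts_delete(table, 1)              # delete 'B'
print(ts_lookup(table, 'C', 0))  # → 2: the probe walks through the tombstone
```

Every lookup that crosses a tombstone pays for it forever (or until a rebuild), which is exactly the accumulation problem backward shift avoids.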

Robin Hood enables a better approach: backward shift deletion. After removing an element, shift subsequent elements backward to fill the gap, but only those that can move closer to their ideal position.

def delete(self, key: K) -> bool:
    idx = self._hash(key)
    psl = 0
    
    # Find the element
    while self.table[idx] is not None:
        if self.table[idx].key == key:
            break
        if psl > self.table[idx].psl:
            return False  # Not found
        idx = (idx + 1) % self.capacity
        psl += 1
    
    if self.table[idx] is None:
        return False
    
    # Backward shift: pull elements back to fill the gap
    self.table[idx] = None
    self.size -= 1
    
    next_idx = (idx + 1) % self.capacity
    while self.table[next_idx] is not None and self.table[next_idx].psl > 0:
        # Move element back one slot, decrease its PSL
        self.table[idx] = self.table[next_idx]
        self.table[idx].psl -= 1
        self.table[next_idx] = None
        
        idx = next_idx
        next_idx = (next_idx + 1) % self.capacity
    
    return True

The key insight: an element with PSL > 0 isn’t in its ideal position, so it can shift backward. An element with PSL = 0 is home; shifting it would break lookups. This gives us a clean termination condition.
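A toy run on a hand-built table (entries as mutable `[key, psl]` pairs, an illustrative assumption) shows both the shift and its termination condition:

```python
def backward_shift_delete(table, idx):
    """Empty slot idx, then pull successors back while their psl > 0."""
    table[idx] = None
    next_idx = (idx + 1) % len(table)
    while table[next_idx] is not None and table[next_idx][1] > 0:
        table[idx] = table[next_idx]
        table[idx][1] -= 1  # one slot closer to home
        table[next_idx] = None
        idx, next_idx = next_idx, (next_idx + 1) % len(table)

# A valid Robin Hood state: A and B hash to slot 0; C and D to 2; E to 4.
table = [['A', 0], ['B', 1], ['C', 0], ['D', 1], ['E', 0], None, None, None]
backward_shift_delete(table, 0)  # delete 'A'

# B shifts back into slot 0 and its psl drops to 0; the shift then stops at C,
# which already sits in its home slot (psl 0) and must not move.
print(table)  # → [['B', 0], None, ['C', 0], ['D', 1], ['E', 0], None, None, None]
```

Note that the gap left at index 1 is harmless: every surviving element is at or after its home slot, so no probe sequence is broken.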

Performance Characteristics

Let’s quantify the variance reduction. Here’s a comparison of probe length distributions:

import random
from collections import Counter

def measure_probe_distribution(table_class, n_elements: int, capacity: int):
    table = table_class(capacity)
    
    # Insert random elements
    for i in range(n_elements):
        table.insert(random.randint(0, 10**9), i)
    
    # Collect PSL distribution
    psls = [entry.psl for entry in table.table if entry is not None]
    mean = sum(psls) / len(psls)
    return {
        'max': max(psls),
        'mean': mean,
        'variance': sum((p - mean) ** 2 for p in psls) / len(psls),
        'distribution': Counter(psls)
    }

# Typical results at 70% load factor:
# Standard linear probing: max PSL ~25-40, high variance
# Robin Hood: max PSL ~8-12, variance reduced by 60-80%

The tradeoff is clear: Robin Hood does more work during insertion (swaps add overhead), but lookups become more predictable. For read-heavy workloads, this is a significant win.

Memory overhead is minimal—typically 1-2 bytes per entry for the PSL field. Cache performance remains excellent because we’re still doing linear probing; the access pattern is sequential.

Practical Considerations and Variants

Load factor: Robin Hood handles higher load factors better than standard linear probing. You can push to 80-85% before performance degrades significantly, compared to 70% for vanilla linear probing.

Resize strategy: Double the capacity when the load factor exceeds the threshold. Resizing is straightforward: allocate a larger table and reinsert every element, which rebuilds the PSL invariant from scratch.

Smart search termination: The PSL-based early termination we implemented is sometimes called “Robin Hood with smart search.” It’s essential for good worst-case performance.

When to use Robin Hood: Choose it when you need predictable latency and can tolerate slightly slower insertions. It’s excellent for read-heavy caches, symbol tables, and any scenario where tail latency matters. Avoid it for write-heavy workloads where the swap overhead dominates.

Modern alternatives like Swiss Tables (used in Abseil and Rust’s hashbrown) use SIMD for parallel probing, which changes the tradeoff calculus. But for straightforward implementations without SIMD, Robin Hood remains one of the best variance-reducing techniques available.

Conclusion

Robin Hood hashing exemplifies a recurring theme in systems design: trading average-case efficiency for worst-case predictability. The insertion overhead is real but modest—a few extra comparisons and occasional swaps. The payoff is dramatic: maximum probe lengths drop by 50-70%, and the variance in lookup times shrinks correspondingly.

For systems where consistent performance matters more than raw throughput, Robin Hood is a compelling choice. The implementation complexity is manageable, the memory overhead is negligible, and the debugging experience improves when you know lookups won’t suddenly take 10x longer due to an unlucky hash collision.

The PSL field also unlocks optimizations that aren’t possible with standard linear probing: early termination during lookups and clean backward-shift deletion. These aren’t just nice-to-haves; they’re what make Robin Hood practical for production use.
