Compressed Trie: Patricia Tree and Radix Tree
Key Insights
- Compressed tries eliminate single-child nodes by storing multiple characters per edge, reducing memory overhead by 50-90% compared to standard tries while maintaining O(k) lookup time where k is key length.
- Patricia trees are binary radix trees that compare bits rather than characters, making them ideal for fixed-width data like IP addresses, while general radix trees excel at string-based lookups.
- The Linux kernel, Redis, and most web framework routers use radix trees internally—understanding their mechanics helps you make informed decisions about data structure selection.
Introduction to Compressed Tries
Standard tries waste enormous amounts of memory. Consider storing the words “application”, “applicant”, and “apply” in a traditional trie. The shared prefix “appl” alone costs four nodes, each holding a single child pointer, and the suffixes below the branch points form still more single-child chains. This pattern of long chains of single-child nodes is the norm, not the exception.
Compressed tries solve this by collapsing these chains. Instead of one character per edge, compressed tries store entire strings. The three words above would share a single edge labeled “appl” before branching into “y” and “ic”. This simple optimization typically reduces node count by 50-90%, depending on key distribution.
The two main variants—radix trees and Patricia trees—take different approaches to compression. Understanding both helps you choose the right tool for your specific problem.
Radix Tree Fundamentals
A radix tree (also called a compact prefix tree) compresses paths where nodes have only one child. Each edge stores a string rather than a single character, and nodes only exist at branching points or key terminations.
The core insight is simple: if a node has exactly one child, merge it with that child. The edge label becomes the concatenation of both original edges.
Here’s a basic radix tree implementation:
```python
class RadixNode:
    def __init__(self):
        self.children = {}   # edge_label -> child_node
        self.is_end = False
        self.value = None    # optional: store associated data


class RadixTree:
    def __init__(self):
        self.root = RadixNode()

    def insert(self, key, value=None):
        node = self.root
        i = 0
        while i < len(key):
            # Find edge with matching prefix
            match_edge = None
            match_len = 0
            for edge in node.children:
                common = self._common_prefix_length(edge, key[i:])
                if common > 0:
                    match_edge = edge
                    match_len = common
                    break
            if match_edge is None:
                # No matching edge; create new one
                node.children[key[i:]] = RadixNode()
                node.children[key[i:]].is_end = True
                node.children[key[i:]].value = value
                return
            if match_len == len(match_edge):
                # Full edge match; continue to child
                node = node.children[match_edge]
                i += match_len
            else:
                # Partial match; split the edge
                self._split_edge(node, match_edge, match_len, key[i:], value)
                return
        node.is_end = True
        node.value = value

    def _split_edge(self, parent, edge, split_pos, remaining_key, value):
        old_child = parent.children[edge]
        # Create intermediate node
        new_node = RadixNode()
        parent.children[edge[:split_pos]] = new_node
        del parent.children[edge]
        # Reattach old child with remaining edge
        new_node.children[edge[split_pos:]] = old_child
        # Add new key's remaining portion
        if len(remaining_key) > split_pos:
            new_child = RadixNode()
            new_child.is_end = True
            new_child.value = value
            new_node.children[remaining_key[split_pos:]] = new_child
        else:
            new_node.is_end = True
            new_node.value = value

    def _common_prefix_length(self, s1, s2):
        i = 0
        while i < len(s1) and i < len(s2) and s1[i] == s2[i]:
            i += 1
        return i
```
The key complexity lies in edge splitting. When inserting “test” into a tree containing “testing”, you must split the “testing” edge into “test” and “ing”, creating an intermediate node.
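To see the split in isolation, here is a minimal sketch using plain dicts instead of the `RadixNode` class (`split_edge` is a hypothetical helper for illustration only):

```python
def split_edge(children, edge, split_pos):
    """Replace `edge` with its prefix, pointing at a new intermediate node."""
    old_child = children.pop(edge)
    intermediate = {"children": {edge[split_pos:]: old_child}, "is_end": False}
    children[edge[:split_pos]] = intermediate
    return intermediate

# A root holding only "testing"; inserting "test" forces a split at position 4.
root_children = {"testing": {"children": {}, "is_end": True}}
mid = split_edge(root_children, "testing", 4)
mid["is_end"] = True            # "test" itself is now a stored key
print(sorted(root_children))    # ['test']
print(sorted(mid["children"]))  # ['ing']
```

The old "testing" child survives unchanged; it is simply re-attached under the shorter "ing" edge.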
Patricia Trees Explained
PATRICIA (Practical Algorithm to Retrieve Information Coded in Alphanumeric) takes compression further by operating at the bit level. Instead of storing edge labels, Patricia trees store a “skip count”—the bit position to examine next.
This design emerged from Donald Morrison’s 1968 work on information retrieval. The key innovation: nodes don’t store the actual key data. Instead, they store only the bit index for comparison, and the full key is stored only at leaf nodes.
```python
class PatriciaNode:
    def __init__(self, bit_index=-1, key=None, value=None):
        self.bit_index = bit_index   # which bit to test
        self.key = key               # full key (only meaningful at leaves)
        self.value = value
        self.left = None             # bit is 0
        self.right = None            # bit is 1


class PatriciaTree:
    def __init__(self):
        self.root = None

    def _get_bit(self, key, index):
        """Extract bit at given index from key."""
        if index < 0:
            return 0
        byte_index = index // 8
        bit_offset = 7 - (index % 8)
        if byte_index >= len(key):
            return 0
        return (key[byte_index] >> bit_offset) & 1

    def _first_differing_bit(self, key1, key2):
        """Find first bit position where keys differ."""
        max_len = max(len(key1), len(key2)) * 8
        for i in range(max_len):
            if self._get_bit(key1, i) != self._get_bit(key2, i):
                return i
        return -1

    def search(self, key):
        if self.root is None:
            return None
        key_bytes = key.encode() if isinstance(key, str) else key
        node = self.root
        prev = None
        # Traverse until we go "up" (bit_index decreases)
        while node is not None and (prev is None or node.bit_index > prev.bit_index):
            prev = node
            if self._get_bit(key_bytes, node.bit_index) == 0:
                node = node.left
            else:
                node = node.right
        # Verify the key matches
        if node is not None and node.key == key_bytes:
            return node.value
        return None
```
The search operation is elegant: traverse based on bit values until you reach a node whose bit index is less than or equal to the parent’s. This indicates you’ve found a “back edge” pointing to the actual key location.
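A quick worked example of the bit arithmetic, as a standalone sketch mirroring `_get_bit` and `_first_differing_bit` above:

```python
def get_bit(key, index):
    # MSB-first addressing: bit 0 is the high bit of the first byte
    byte_index, bit_offset = index // 8, 7 - (index % 8)
    if byte_index >= len(key):
        return 0
    return (key[byte_index] >> bit_offset) & 1

def first_differing_bit(k1, k2):
    for i in range(max(len(k1), len(k2)) * 8):
        if get_bit(k1, i) != get_bit(k2, i):
            return i
    return -1

# "test" and "team" share bytes 't' and 'e' (16 bits); the third bytes
# 's' (0x73) and 'a' (0x61) first differ at bit 3 within the byte.
print(first_differing_bit(b"test", b"team"))  # 19
```

A Patricia insert of "team" into a tree already holding "test" would therefore create a node testing bit 19, with "test" (bit is 1) on one side and "team" (bit is 0) on the other.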
Key Operations: Insert, Search, Delete
Let’s complete the radix tree with all operations:
```python
class RadixTree:
    # ... (previous code) ...

    def search(self, key):
        node = self.root
        i = 0
        while i < len(key):
            found = False
            for edge, child in node.children.items():
                if key[i:].startswith(edge):
                    node = child
                    i += len(edge)
                    found = True
                    break
                elif edge.startswith(key[i:]):
                    # Key is prefix of edge; key not in tree
                    return None
            if not found:
                return None
        return node.value if node.is_end else None

    def delete(self, key):
        """Delete key and merge nodes if possible."""
        # Find the node and its parent chain
        path = []  # [(parent, edge_label, node), ...]
        node = self.root
        i = 0
        while i < len(key):
            for edge, child in node.children.items():
                if key[i:].startswith(edge):
                    path.append((node, edge, child))
                    node = child
                    i += len(edge)
                    break
            else:
                return False  # Key not found
        if not node.is_end:
            return False
        node.is_end = False
        node.value = None
        # Merge nodes bottom-up
        self._try_merge(path)
        return True

    def _try_merge(self, path):
        """Merge single-child nodes after deletion."""
        while path:
            parent, edge, node = path.pop()
            if node.is_end:
                break
            if len(node.children) == 0:
                # Remove empty leaf
                del parent.children[edge]
            elif len(node.children) == 1:
                # Merge with single child
                child_edge, child = next(iter(node.children.items()))
                del parent.children[edge]
                parent.children[edge + child_edge] = child
```
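The merge step can likewise be sketched in isolation with plain dicts (illustrative only, not the RadixTree internals):

```python
def merge_single_child(children, edge):
    """Collapse a non-terminal node with exactly one child into its parent edge."""
    node = children[edge]
    if not node["is_end"] and len(node["children"]) == 1:
        child_edge, child = next(iter(node["children"].items()))
        del children[edge]
        children[edge + child_edge] = child

# After deleting "test" from a tree holding "test" and "testing", the
# intermediate node keeps one child ("ing") and collapses back into a
# single "testing" edge.
root_children = {"test": {"is_end": False, "children": {
    "ing": {"is_end": True, "children": {}}}}}
merge_single_child(root_children, "test")
print(sorted(root_children))  # ['testing']
```

Deletion thus undoes insertion's split exactly: the tree for {"testing"} alone looks the same whether or not "test" was ever inserted.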
Complexity analysis:
- Search: O(k) where k is key length
- Insert: O(k) with potential edge split
- Delete: O(k) with potential node merge
The constant factors are higher than standard tries due to string comparisons, but memory savings usually outweigh this cost.
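The 50-90% figure is easy to sanity-check: build a plain character trie, then count how many nodes would survive compression (the root, branch points, and key ends). A small sketch, assuming dict-based trie nodes:

```python
def build_trie(words):
    root = {"children": {}, "end": False}
    for w in words:
        node = root
        for ch in w:
            node = node["children"].setdefault(ch, {"children": {}, "end": False})
        node["end"] = True
    return root

def count_all(node):
    # Every node in the uncompressed trie, including the root
    return 1 + sum(count_all(c) for c in node["children"].values())

def count_surviving(node, is_root=True):
    # Nodes a radix tree keeps: root, key ends, and branch points;
    # single-child chain nodes are merged away into edge labels.
    keep = is_root or node["end"] or len(node["children"]) != 1
    return int(keep) + sum(count_surviving(c, False)
                           for c in node["children"].values())

trie = build_trie(["application", "applicant", "apply"])
print(count_all(trie), count_surviving(trie))  # 15 6
```

For just these three keys, compression drops 15 nodes to 6, a 60% reduction; longer shared prefixes push the savings higher.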
Radix Tree vs. Patricia Tree: When to Use Each
Choose Patricia trees when:
- Working with binary data or fixed-width keys (IP addresses, hashes)
- Memory is extremely constrained
- Keys share long common prefixes at the bit level
- You need predictable memory usage regardless of key content
Choose radix trees when:
- Working with human-readable strings
- Keys are variable-length text (URLs, words, paths)
- You need to enumerate keys with a common prefix
- Implementation simplicity matters
Patricia trees use less memory (no edge labels are stored, only bit indices) but are harder to implement correctly and to debug. Radix trees are more intuitive and work naturally with string data.
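The prefix-enumeration strength of radix trees is easy to demonstrate on a hand-built dict structure for the three words from the introduction (a sketch; the node shape here is hypothetical, not the RadixTree class above):

```python
def collect(node, prefix, out):
    """Depth-first walk accumulating every stored key under this node."""
    if node["is_end"]:
        out.append(prefix)
    for edge, child in node["children"].items():
        collect(child, prefix + edge, out)

# "application", "applicant", "apply" share the edge "appl", then branch.
root = {"is_end": False, "children": {
    "appl": {"is_end": False, "children": {
        "y": {"is_end": True, "children": {}},
        "ic": {"is_end": False, "children": {
            "ation": {"is_end": True, "children": {}},
            "ant":   {"is_end": True, "children": {}},
        }},
    }},
}}

out = []
collect(root, "", out)
print(sorted(out))  # ['applicant', 'application', 'apply']
```

A real prefix query would first walk down to the node matching the query prefix, then run this same traversal from there; a hash map offers no comparable operation.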
Real-World Applications
Here’s a practical URL router using a radix tree:
```python
class URLRouter:
    def __init__(self):
        self.tree = RadixTree()

    def add_route(self, path, handler):
        # Normalize path
        path = path.rstrip('/')
        if not path:
            path = '/'
        self.tree.insert(path, handler)

    def match(self, path):
        path = path.rstrip('/')
        if not path:
            path = '/'
        return self.tree.search(path)


# Usage
router = URLRouter()
router.add_route('/api/users', 'users_handler')
router.add_route('/api/users/profile', 'profile_handler')
router.add_route('/api/posts', 'posts_handler')

handler = router.match('/api/users/profile')  # Returns 'profile_handler'
```
The Linux kernel uses radix trees for its page cache, mapping file offsets to memory pages. Redis uses them for cluster slot mapping. Go’s httprouter and Rust’s actix-web both use radix trees for URL routing, achieving sub-microsecond route matching.
Performance Considerations and Trade-offs
Compressed tries aren’t always the answer. Consider these trade-offs:
Memory: Radix trees beat standard tries and often beat hash maps for string keys with shared prefixes. However, if keys share no prefixes, you’re paying overhead for no benefit.
Speed: Hash maps offer O(1) average lookup (after an O(k) hash of the key) versus O(k) node traversals with string comparisons for tries. For short keys, or when you never need prefix operations, hash maps usually win.
Cache locality: Radix trees have poor cache behavior compared to arrays or hash tables. Each node access potentially causes a cache miss. For hot paths, consider flattening frequently-accessed subtrees.
When to avoid compressed tries:
- Random, unrelated keys with no shared prefixes
- Extremely short keys (hash map overhead is lower)
- When you only need exact match, never prefix operations
- Memory-mapped or disk-based storage (pointer chasing is expensive)
The right choice depends on your access patterns. Profile with realistic data before committing to any data structure.