Trie Data Structure: Prefix Tree Implementation


Key Insights

  • Tries provide O(m) time complexity for insert, search, and delete operations where m is the key length—independent of how many keys are stored, unlike hash tables which can degrade with collisions.
  • The choice between array-based and hash map-based node implementations involves a direct memory-speed tradeoff: arrays offer faster access but waste memory on sparse character sets, while hash maps adapt to actual usage.
  • Deletion in tries requires careful consideration of shared prefixes—you can only remove nodes that aren’t part of other words, making recursive cleanup essential for memory efficiency.

Introduction to Tries

A trie (pronounced “try”) is a tree-based data structure optimized for storing and retrieving strings. The name comes from “reTRIEval,” though some pronounce it “tree” to emphasize its structure. It’s also called a prefix tree because all descendants of a node share a common prefix.

Here’s why tries matter: when you search for a string in a hash table, you compute a hash of the entire string. If you have a million words and want to find all words starting with “pre,” you’re stuck iterating through everything. A trie lets you navigate directly to the “pre” node and enumerate only the relevant subtree.

Operation       Array/List   Hash Table               Trie
Insert          O(n)         O(m) avg, O(nm) worst    O(m)
Search          O(nm)        O(m) avg, O(nm) worst    O(m)
Prefix Search   O(nm)        O(nm)                    O(m + k)
Delete          O(nm)        O(m) avg                 O(m)

n = number of strings, m = length of string, k = number of matches

The trie’s killer feature is prefix operations. Finding all words with a given prefix takes O(m) to reach the prefix node, then O(k) to collect k results. No other standard data structure matches this.

Trie Node Structure

A trie node needs three things: a way to access children, a flag indicating whether this node represents a complete word, and optionally a value for key-value storage.

The children structure is your first design decision. For lowercase English letters, a fixed array of 26 slots offers O(1) child access. For Unicode or sparse character sets, a hash map saves memory at the cost of slightly slower lookups.

class TrieNodeArray:
    """Array-based node for lowercase English letters only."""
    
    def __init__(self):
        self.children = [None] * 26  # Fixed size for a-z
        self.is_end_of_word = False
        self.value = None  # Optional: for key-value storage
    
    def _char_to_index(self, char: str) -> int:
        return ord(char) - ord('a')
    
    def get_child(self, char: str):
        return self.children[self._char_to_index(char)]
    
    def set_child(self, char: str, node):
        self.children[self._char_to_index(char)] = node


class TrieNodeMap:
    """Hash map-based node for any character set."""
    
    def __init__(self):
        self.children = {}  # Dynamic size
        self.is_end_of_word = False
        self.value = None
    
    def get_child(self, char: str):
        return self.children.get(char)
    
    def set_child(self, char: str, node):
        self.children[char] = node
    
    def has_children(self) -> bool:
        return len(self.children) > 0

The array approach allocates 26 pointers per node regardless of usage. If you’re storing a sparse dictionary, most slots sit empty. The hash map approach only allocates space for characters actually used, but adds hashing overhead.

For production systems handling English text, I recommend the hash map approach. The memory savings outweigh the minor performance cost, and you get Unicode support for free.
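To make the tradeoff concrete, here's a rough sketch comparing just the children containers of the two node designs (CPython-specific; exact byte counts vary across versions, so treat the printed numbers as illustrative):

```python
import sys

# Compare only the children containers of the two node designs:
# the array node always pays for 26 slots, while the dict node
# grows with the characters actually stored.
array_children = [None] * 26       # fixed 26 pointer slots, always allocated
map_children = {"c": None}         # a single child actually present

print(sys.getsizeof(array_children), "bytes for the 26-slot array")
print(sys.getsizeof(map_children), "bytes for the one-entry dict")
```

For a node with only one or two live children, the dict wins; for a densely branching node near the root, the array's fixed cost amortizes well.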

Core Operations: Insert

Insertion traverses the trie character by character, creating nodes as needed, then marks the final node as a word boundary.

class Trie:
    def __init__(self):
        self.root = TrieNodeMap()
    
    def insert(self, word: str, value=None) -> None:
        """
        Insert a word into the trie.
        Time: O(m) where m is word length
        Space: O(m) in worst case (all new nodes)
        """
        node = self.root
        
        for char in word:
            # Check if path exists for current character
            child = node.get_child(char)
            
            if child is None:
                # Create new node and link it
                child = TrieNodeMap()
                node.set_child(char, child)
            
            # Move down the trie
            node = child
        
        # Mark the end of the word
        node.is_end_of_word = True
        node.value = value  # Store associated value if provided

Let’s trace inserting “cat” then “car” into an empty trie:

  1. Insert “cat”: Create nodes for c → a → t, mark ‘t’ as end-of-word
  2. Insert “car”: Traverse existing c → a, create ‘r’ node, mark ‘r’ as end-of-word

After both insertions, the ‘a’ node has two children (‘t’ and ‘r’), demonstrating prefix sharing. This is where tries shine—common prefixes are stored once.
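This trace is easy to verify. A pared-down dict-based node (not the full classes above) is enough to show the shared ‘a’ node branching in two:

```python
class Node:
    """Pared-down trie node: children dict plus end-of-word flag."""
    def __init__(self):
        self.children = {}
        self.is_end_of_word = False

root = Node()
for word in ("cat", "car"):
    node = root
    for ch in word:
        # setdefault creates the child only if the path doesn't exist yet
        node = node.children.setdefault(ch, Node())
    node.is_end_of_word = True

a_node = root.children["c"].children["a"]
print(sorted(a_node.children))  # → ['r', 't']
```

The ‘c’ and ‘a’ nodes are created once and reused; only the final characters diverge.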

Core Operations: Search and StartsWith

These operations look similar but differ in their termination condition. Search requires an exact match; startsWith only needs the prefix to exist.

def search(self, word: str) -> bool:
    """
    Return True if word exists in trie (exact match).
    Time: O(m)
    """
    node = self._traverse_to(word)
    
    # Must reach end AND node must be marked as word
    return node is not None and node.is_end_of_word

def starts_with(self, prefix: str) -> bool:
    """
    Return True if any word in trie starts with prefix.
    Time: O(m)
    """
    node = self._traverse_to(prefix)
    
    # Only need to reach the end of prefix
    return node is not None

def _traverse_to(self, s: str):
    """
    Helper: traverse trie following string s.
    Returns final node or None if path doesn't exist.
    """
    node = self.root
    
    for char in s:
        child = node.get_child(char)
        if child is None:
            return None
        node = child
    
    return node

The critical difference is on the return line. For “car” in a trie containing only “cars”:

  • search("car") returns False—we reach the ‘r’ node but is_end_of_word is False
  • starts_with("car") returns True—the path exists, that’s all we need

This distinction trips up many implementations. Always verify your search checks the end-of-word flag.
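Here's the “car” vs. “cars” case as a runnable sketch, rebuilt on a minimal nested-dict trie so it stands alone (the '$' end-of-word marker is an assumption of this sketch, not the article's node classes):

```python
def build(words):
    """Nested-dict trie; '$' marks the end of a complete word."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def _traverse_to(root, s):
    """Follow string s; return the final node or None."""
    node = root
    for ch in s:
        node = node.get(ch)
        if node is None:
            return None
    return node

def search(root, word):
    node = _traverse_to(root, word)
    return node is not None and "$" in node   # path AND end-of-word mark

def starts_with(root, prefix):
    return _traverse_to(root, prefix) is not None  # path alone suffices

trie = build(["cars"])
print(search(trie, "car"))       # → False: path exists, no end-of-word mark
print(starts_with(trie, "car"))  # → True: the path is all we need
```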

Core Operations: Delete

Deletion is the trickiest operation. You can’t blindly remove nodes because they might be part of other words. Consider deleting “car” from a trie containing “car” and “cars”—you should unmark the ‘r’ node but not remove it.

def delete(self, word: str) -> bool:
    """
    Delete word from trie. Returns True if word existed.
    Time: O(m)
    """
    # The recursive helper returns "should the parent prune this node",
    # not "did the word exist" — so check existence first. Without this
    # check, deleting "car" from {"car", "cars"} would return False
    # even though the word was found and unmarked.
    if not self.search(word):
        return False
    self._delete_recursive(self.root, word, 0)
    return True

def _delete_recursive(self, node, word: str, depth: int) -> bool:
    """
    Recursively delete word and clean up unused nodes.
    Returns True if parent should delete this node.
    """
    if node is None:
        return False
    
    # Base case: reached end of word
    if depth == len(word):
        if not node.is_end_of_word:
            return False  # Word doesn't exist
        
        node.is_end_of_word = False
        
        # Delete node if it has no children
        return not node.has_children()
    
    char = word[depth]
    child = node.get_child(char)
    
    should_delete_child = self._delete_recursive(child, word, depth + 1)
    
    if should_delete_child:
        # Remove the child reference (assumes dict-based TrieNodeMap nodes)
        node.children.pop(char, None)
        
        # This node can be deleted if:
        # 1. It has no other children
        # 2. It's not the end of another word
        return not node.has_children() and not node.is_end_of_word
    
    return False

The recursion unwinds from the leaf toward the root, cleaning up nodes that are no longer needed. A node survives deletion if it either has remaining children or marks the end of another word.
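The same unmark-then-prune behavior can be checked on a minimal nested-dict trie. This is a simplified sketch: '$' marks end-of-word, and this `delete` returns a prune flag rather than the existence flag used above:

```python
def build(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def delete(node, word, depth=0):
    """Return True if the parent should remove this node."""
    if depth == len(word):
        node.pop("$", None)           # unmark end-of-word
        return len(node) == 0         # prune only if no children remain
    ch = word[depth]
    child = node.get(ch)
    if child is None:
        return False                  # word not present; nothing changes
    if delete(child, word, depth + 1):
        del node[ch]                  # child is dead: unlink it
        return len(node) == 0 and "$" not in node
    return False

trie = build(["car", "cars"])
delete(trie, "car")
r_node = trie["c"]["a"]["r"]
print("$" in r_node, "s" in r_node)  # → False True: unmarked but kept
```

Deleting “cars” afterward empties the trie entirely, since every node on the c → a → r → s path becomes prunable.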

Practical Applications

Autocomplete Systems

The most common trie application. When a user types “prog,” you traverse to that node and collect all words in the subtree.

def autocomplete(self, prefix: str, limit: int = 10) -> list:
    """
    Return up to `limit` words starting with prefix.
    Time: O(m + k) where k is number of matches collected
    """
    results = []
    node = self._traverse_to(prefix)
    
    if node is None:
        return results
    
    self._collect_words(node, prefix, results, limit)
    return results

def _collect_words(self, node, current_word: str, 
                   results: list, limit: int) -> None:
    """DFS to collect all words under this node."""
    if len(results) >= limit:
        return
    
    if node.is_end_of_word:
        results.append(current_word)
    
    for char, child in node.children.items():
        if len(results) >= limit:
            return
        self._collect_words(child, current_word + char, results, limit)
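To see the flow end to end, here's a self-contained sketch of the same traverse-then-collect logic on a nested-dict trie (the '$' end marker and the sorted child traversal are assumptions of this sketch, chosen to make the output deterministic):

```python
def build(words):
    """Nested-dict trie; '$' marks the end of a word."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def autocomplete(root, prefix, limit=10):
    """Return up to `limit` words starting with prefix."""
    node = root
    for ch in prefix:                 # O(m): walk to the prefix node
        node = node.get(ch)
        if node is None:
            return []
    results = []

    def collect(node, word):          # O(k): DFS over the subtree
        if len(results) >= limit:
            return
        if "$" in node:
            results.append(word)
        for ch in sorted(node):
            if ch != "$":
                collect(node[ch], word + ch)

    collect(node, prefix)
    return results

trie = build(["program", "programmer", "progress", "problem"])
print(autocomplete(trie, "prog"))  # → ['program', 'programmer', 'progress']
```

Note that “problem” never appears: the traversal commits to the “prog” subtree before any collection starts.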

Spell Checkers

Tries enable efficient fuzzy matching. You can implement Levenshtein distance search by exploring multiple branches simultaneously, tracking edit distance as you go.
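Here's a sketch of that idea, assuming the standard row-per-node Levenshtein dynamic program over a nested-dict trie ('$' marks complete words). A branch is pruned as soon as every cell in its row exceeds the edit budget:

```python
def build(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def fuzzy_search(root, target, max_dist):
    """Collect words within max_dist Levenshtein edits of target."""
    results = []

    def walk(node, prefix, prev_row):
        # prev_row[j] = edit distance between prefix and target[:j]
        if "$" in node and prev_row[-1] <= max_dist:
            results.append(prefix)
        if min(prev_row) > max_dist:
            return  # prune: no extension can get back under budget
        for ch, child in node.items():
            if ch == "$":
                continue
            row = [prev_row[0] + 1]
            for j in range(1, len(target) + 1):
                cost = 0 if target[j - 1] == ch else 1
                row.append(min(row[j - 1] + 1,           # insertion
                               prev_row[j] + 1,          # deletion
                               prev_row[j - 1] + cost))  # substitution
            walk(child, prefix + ch, row)

    walk(root, "", list(range(len(target) + 1)))
    return sorted(results)

trie = build(["cat", "cart", "cast", "dog"])
print(fuzzy_search(trie, "cat", 1))  # → ['cart', 'cast', 'cat']
```

The pruning is the payoff: the “dog” branch dies after two characters instead of being compared in full against every query.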

IP Routing Tables

Routers use tries (specifically, radix trees) to match IP prefixes. A packet’s destination IP is matched against stored prefixes to find the longest match, determining the next hop.
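The longest-match idea can be sketched with a binary trie over prefix bit strings. The routes below are toy data for illustration; production routers use compressed variants and real address arithmetic:

```python
def add_route(table, prefix_bits, next_hop):
    """Store next_hop under a binary prefix like '1010'."""
    node = table
    for bit in prefix_bits:
        node = node.setdefault(bit, {})
    node["hop"] = next_hop

def lookup(table, addr_bits):
    """Walk the address bits, remembering the deepest hop seen."""
    node, best = table, table.get("hop")
    for bit in addr_bits:
        node = node.get(bit)
        if node is None:
            break                    # ran out of matching prefixes
        if "hop" in node:
            best = node["hop"]       # longer match wins
    return best

table = {}
add_route(table, "10", "gateway-A")
add_route(table, "1010", "gateway-B")
print(lookup(table, "10100110"))  # prints gateway-B (longest match)
print(lookup(table, "10011100"))  # prints gateway-A
```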

Word Games

Boggle solvers use tries to prune impossible paths early. If no word starts with “qz,” you can skip that branch entirely rather than checking every combination.

Optimizations and Variants

Compressed Tries (Radix Trees)

Standard tries waste nodes on non-branching paths. “internationalization” creates 20 nodes even if no other word shares its prefix. Radix trees compress single-child chains into single nodes storing multiple characters.

class RadixNode:
    """Node for a compressed trie (radix tree)."""
    
    def __init__(self, prefix: str = ""):
        self.prefix = prefix  # Edge label (multiple characters)
        self.children = {}
        self.is_end_of_word = False
        self.value = None

Radix trees reduce memory usage significantly for datasets with long unique prefixes. The tradeoff is more complex insertion and deletion logic—you may need to split or merge edge labels.
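Here's a sketch of insertion with edge splitting on the node above, with a matching search so the result can be checked. This is one illustrative way to handle splits, not the only one:

```python
class RadixNode:
    def __init__(self, prefix=""):
        self.prefix = prefix         # edge label, possibly many characters
        self.children = {}           # first char of edge label -> node
        self.is_end_of_word = False

def radix_insert(root, word):
    node, rest = root, word
    while rest:
        child = node.children.get(rest[0])
        if child is None:            # no edge starts with this character
            leaf = RadixNode(rest)
            leaf.is_end_of_word = True
            node.children[rest[0]] = leaf
            return
        label, i = child.prefix, 0
        while i < len(label) and i < len(rest) and label[i] == rest[i]:
            i += 1                   # length of the common prefix
        if i == len(label):          # edge fully matched: descend
            node, rest = child, rest[i:]
        else:                        # partial match: split the edge at i
            mid = RadixNode(label[:i])
            node.children[label[0]] = mid
            child.prefix = label[i:]
            mid.children[child.prefix[0]] = child
            rest = rest[i:]
            if rest:
                leaf = RadixNode(rest)
                leaf.is_end_of_word = True
                mid.children[rest[0]] = leaf
            else:
                mid.is_end_of_word = True
            return
    node.is_end_of_word = True

def radix_search(root, word):
    node, rest = root, word
    while rest:
        child = node.children.get(rest[0])
        if child is None or not rest.startswith(child.prefix):
            return False
        node, rest = child, rest[len(child.prefix):]
    return node.is_end_of_word

root = RadixNode()
for w in ("internationalization", "internet", "intern"):
    radix_insert(root, w)
print(radix_search(root, "internet"), radix_search(root, "inter"))  # → True False
```

After these three insertions, “internationalization” occupies a handful of nodes instead of twenty: the shared “intern” chunk is a single edge label.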

Ternary Search Tries

TSTs use three pointers per node (less than, equal, greater than) instead of an array or hash map. They offer a middle ground between binary search trees and standard tries, with better memory usage than array-based tries and faster lookups than hash map-based ones.
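A minimal sketch of the classic Bentley–Sedgewick-style TST insert and search, included as an illustration rather than a tuned implementation:

```python
class TSTNode:
    def __init__(self, char):
        self.char = char
        self.lt = None   # subtree for characters < char
        self.eq = None   # subtree for the next character of the key
        self.gt = None   # subtree for characters > char
        self.is_end_of_word = False

def tst_insert(node, word, i=0):
    ch = word[i]
    if node is None:
        node = TSTNode(ch)
    if ch < node.char:
        node.lt = tst_insert(node.lt, word, i)
    elif ch > node.char:
        node.gt = tst_insert(node.gt, word, i)
    elif i + 1 < len(word):
        node.eq = tst_insert(node.eq, word, i + 1)   # consume the character
    else:
        node.is_end_of_word = True
    return node

def tst_search(node, word, i=0):
    if node is None or not word:
        return False
    ch = word[i]
    if ch < node.char:
        return tst_search(node.lt, word, i)
    if ch > node.char:
        return tst_search(node.gt, word, i)
    if i + 1 < len(word):
        return tst_search(node.eq, word, i + 1)
    return node.is_end_of_word

root = None
for w in ("cat", "car", "cart"):
    root = tst_insert(root, w)
print(tst_search(root, "car"), tst_search(root, "ca"))  # → True False
```

Only the `eq` link consumes a character; `lt` and `gt` steps stay on the same position, which is what keeps the per-node footprint at three pointers.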

When to Use Each

  • Standard trie (hash map): General purpose, Unicode support, moderate memory usage
  • Standard trie (array): Maximum speed for fixed alphabets, memory-intensive
  • Radix tree: Long strings with unique prefixes, memory-constrained environments
  • Ternary search trie: Large alphabets where hash overhead matters

For most applications, start with a hash map-based standard trie. Profile before optimizing—the simpler implementation is easier to debug and often fast enough.
