Trie vs Hash Map: When to Use Which
Key Insights
- Hash maps offer O(1) average-case lookups but fall apart when you need prefix matching, ordered iteration, or predictable worst-case performance
- Tries excel at prefix-based operations like autocomplete and spell-checking, trading memory overhead for zero hash collisions and natural lexicographic ordering
- Your choice should depend on your access patterns: if you’re doing exact lookups on mixed key types, use a hash map; if you’re doing prefix searches on strings, use a trie
Introduction
Every developer reaches for a hash map by default. It’s the Swiss Army knife of data structures—fast, familiar, and available in every language’s standard library. But this default choice becomes a liability when your use case involves prefix matching, autocomplete, or any operation where you need to find keys that start with something rather than match exactly.
Tries (pronounced “try,” from retrieval) solve problems that hash maps simply cannot handle efficiently. Understanding when to use each structure separates developers who write adequate code from those who write performant systems.
This isn’t an academic exercise. Autocomplete systems, IP routing tables, spell checkers, and DNS lookups all rely on making the right choice between these structures. Let’s break down when each one earns its place in your codebase.
How Each Structure Works
Hash Maps
A hash map stores key-value pairs by computing a hash of the key, then using that hash to determine a bucket location. When you insert "apple": 5, the hash function converts "apple" to an integer, mods it by the bucket count, and stores the pair at that index.
Collisions occur when two keys hash to the same bucket. Most implementations handle this through chaining (linked lists at each bucket) or open addressing (probing for the next empty slot).
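As a quick sketch of chaining, here is a collision in miniature. Integer keys keep the arithmetic deterministic, since Python hashes small ints to themselves; the bucket count of 8 is arbitrary:

```python
# With hash(key) % 8 as the bucket index, keys 3 and 11 collide:
# hash(3) % 8 == 3 and hash(11) % 8 == 3.
buckets = [[] for _ in range(8)]

def insert(key, value):
    bucket = buckets[hash(key) % 8]
    for i, (k, _) in enumerate(bucket):
        if k == key:
            bucket[i] = (key, value)  # overwrite an existing key
            return
    bucket.append((key, value))      # chaining: append to the bucket's list

insert(3, "three")
insert(11, "eleven")
print(buckets[3])  # [(3, 'three'), (11, 'eleven')] -- both pairs share bucket 3
```

Both keys land in the same bucket, and the per-bucket list keeps them both; a lookup then walks that list comparing keys.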
Tries
A trie stores strings character by character in a tree structure. Each node represents a character, and a path from the root spells out a key; an end-of-word marker on the final node distinguishes complete keys from mere prefixes. The word “cat” occupies three nodes: c → a → t, with the marker on the t node indicating “cat” is a complete word.
Here’s a minimal implementation of both:
```python
class HashMap:
    def __init__(self, size=1024):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _hash(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        bucket = self.buckets[self._hash(key)]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        bucket = self.buckets[self._hash(key)]
        for k, v in bucket:
            if k == key:
                return v
        return None


class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False
        self.value = None


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for char in key:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True
        node.value = value

    def get(self, key):
        node = self.root
        for char in key:
            if char not in node.children:
                return None
            node = node.children[char]
        return node.value if node.is_end else None
```
The hash map relies on a good hash function and sufficient bucket count. The trie builds a tree where shared prefixes share nodes—“cat” and “car” share the c → a path.
Time & Space Complexity Comparison
Let n be the number of keys and m be the average key length.
| Operation | Hash Map (Average) | Hash Map (Worst) | Trie |
|---|---|---|---|
| Insert | O(m) | O(n × m) | O(m) |
| Lookup | O(m) | O(n × m) | O(m) |
| Delete | O(m) | O(n × m) | O(m) |
| Prefix Search | O(n × m) | O(n × m) | O(p + k) |
| Space | O(n × m) | O(n × m) | O(n × m × alphabet) |
The critical insight: hash map operations include O(m) for hashing the key, which people often forget. The “O(1)” lookup is only true when treating key comparison as constant time.
For prefix searches, p is the prefix length and k is the number of matching keys. A trie finds all words starting with “pre” by traversing three nodes, then collecting descendants. A hash map must scan every key.
Space complexity favors hash maps for sparse datasets. Tries allocate nodes for every character, and each node carries pointer overhead. With a 26-character alphabet, empty child pointers waste significant memory.
Where Tries Win
Tries dominate when your operations involve prefixes, ordering, or when you need guaranteed performance.
Prefix matching is the killer feature. Finding all keys starting with a prefix requires O(prefix_length + results) time, regardless of total keys stored. Hash maps require a full scan.
Lexicographic ordering comes free. A depth-first traversal of a trie yields keys in sorted order. Hash maps provide no ordering guarantees.
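A minimal sketch of that traversal, using the same node shape as above (the function names here are illustrative). One caveat: with dict-based children the order is not literally free, since each node's children must be visited in sorted order; with array-indexed children it would be:

```python
# DFS over a trie yields stored keys in lexicographic order,
# with no separate sort over the full key set.
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

def insert(root, word):
    node = root
    for char in word:
        node = node.children.setdefault(char, TrieNode())
    node.is_end = True

def sorted_keys(node, prefix=""):
    """Yield every stored key in sorted order via depth-first traversal."""
    if node.is_end:
        yield prefix
    for char in sorted(node.children):  # visit children alphabetically
        yield from sorted_keys(node.children[char], prefix + char)

root = TrieNode()
for word in ["banana", "apple", "cherry", "apricot"]:
    insert(root, word)

print(list(sorted_keys(root)))  # ['apple', 'apricot', 'banana', 'cherry']
```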
No hash collisions means predictable worst-case performance. Critical for real-time systems where latency spikes are unacceptable.
Here’s autocomplete implemented with a trie:
```python
class AutocompleteTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word, weight=1):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end = True
        node.value = weight

    def _collect_words(self, node, prefix, results, limit):
        if len(results) >= limit:
            return
        if node.is_end:
            results.append((prefix, node.value))
        for char in sorted(node.children.keys()):
            self._collect_words(
                node.children[char],
                prefix + char,
                results,
                limit
            )

    def autocomplete(self, prefix, limit=10):
        node = self.root
        for char in prefix:
            if char not in node.children:
                return []
            node = node.children[char]
        results = []
        self._collect_words(node, prefix, results, limit)
        return results


# Usage
trie = AutocompleteTrie()
words = ["apple", "application", "apply", "banana", "band", "bandana"]
for word in words:
    trie.insert(word)

print(trie.autocomplete("app"))  # [('apple', 1), ('application', 1), ('apply', 1)]
print(trie.autocomplete("ban"))  # [('banana', 1), ('band', 1), ('bandana', 1)]
```
Try implementing this efficiently with a hash map. You can’t—you’d need to iterate every key and check startswith().
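For contrast, here is a sketch of that naive dict-based approach (the function name is illustrative). It works, but every query scans all n keys, O(n × p), no matter how few match:

```python
# Naive prefix search over a plain dict: full scan plus startswith().
def autocomplete_dict(d, prefix, limit=10):
    matches = [(k, v) for k, v in d.items() if k.startswith(prefix)]
    return sorted(matches)[:limit]

words = {"apple": 1, "application": 1, "apply": 1, "banana": 1}
print(autocomplete_dict(words, "app"))
# [('apple', 1), ('application', 1), ('apply', 1)]
```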
Where Hash Maps Win
Hash maps remain the right choice for most general-purpose scenarios.
Simple key-value lookups are hash maps’ bread and butter. If you’re only doing exact matches, the simpler implementation and better cache locality of hash maps win.
Non-string keys disqualify tries entirely. Integers, tuples, custom objects—hash maps handle them all. Tries fundamentally require sequential, character-like keys.
Memory efficiency favors hash maps for sparse datasets or short keys. A hash map storing 1000 random 4-character strings uses far less memory than a trie with the same data.
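A rough way to see the overhead behind that claim: count the trie nodes required for about 1000 random 4-character strings against the single entry per key a dict would hold. This sketch only counts nodes, not bytes, and the names and seed are illustrative:

```python
import random
import string

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

def count_nodes(node):
    """Total nodes in the trie, root included."""
    return 1 + sum(count_nodes(c) for c in node.children.values())

random.seed(42)
keys = {"".join(random.choices(string.ascii_lowercase, k=4)) for _ in range(1000)}

root = TrieNode()
for key in keys:
    node = root
    for char in key:
        node = node.children.setdefault(char, TrieNode())
    node.is_end = True

print(len(keys), count_nodes(root))
# The trie needs well over twice as many nodes as there are keys,
# and each node carries its own dict and flag overhead.
```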
Here’s a word frequency counter—a perfect hash map use case:
```python
from collections import defaultdict

def word_frequency(text):
    freq = defaultdict(int)
    for word in text.lower().split():
        # Strip punctuation
        word = ''.join(c for c in word if c.isalnum())
        if word:
            freq[word] += 1
    return freq

def top_words(freq, n=10):
    return sorted(freq.items(), key=lambda x: -x[1])[:n]

# Usage
text = """
The quick brown fox jumps over the lazy dog.
The dog was not amused. The fox was quick to escape.
"""
freq = word_frequency(text)
print(top_words(freq, 5))
# [('the', 4), ('quick', 2), ('fox', 2), ('dog', 2), ('was', 2)]
```
No prefix operations, no ordering requirements, just counting. A hash map is the obvious choice.
Real-World Decision Framework
Use this decision tree:
- Do you need prefix matching? → Trie
- Do you need keys in sorted order? → Trie
- Are your keys non-strings? → Hash Map
- Do you need guaranteed worst-case performance? → Trie
- Is memory constrained and data sparse? → Hash Map
- Everything else → Hash Map
Here’s a spell-checker that demonstrates both approaches:
```python
class SpellChecker:
    def __init__(self, dictionary_words):
        # Hash map for exact lookup
        self.word_set = set(dictionary_words)
        # Trie for prefix-based suggestions
        self.trie = AutocompleteTrie()
        for word in dictionary_words:
            self.trie.insert(word)

    def is_valid(self, word):
        """O(m) exact lookup - hash map wins"""
        return word.lower() in self.word_set

    def suggest_completions(self, prefix, limit=5):
        """O(p + k) prefix search - trie wins"""
        return [word for word, _ in self.trie.autocomplete(prefix.lower(), limit)]

    def check_text(self, text):
        results = []
        for word in text.split():
            clean = ''.join(c for c in word.lower() if c.isalnum())
            if not clean:
                continue
            if self.is_valid(clean):
                results.append((word, True, []))
            else:
                # Try progressively shorter prefixes until something matches
                suggestions = []
                for end in range(min(3, len(clean)), 0, -1):
                    suggestions = self.suggest_completions(clean[:end])
                    if suggestions:
                        break
                results.append((word, False, suggestions))
        return results


# Usage
dictionary = ["hello", "help", "helper", "helping", "world", "word", "work"]
checker = SpellChecker(dictionary)
text = "helo wrld helping"

for word, valid, suggestions in checker.check_text(text):
    if valid:
        print(f"✓ {word}")
    else:
        print(f"✗ {word} → suggestions: {suggestions}")

# Output:
# ✗ helo → suggestions: ['hello', 'help', 'helper', 'helping']
# ✗ wrld → suggestions: ['word', 'work', 'world']
# ✓ helping
```
This hybrid approach uses each structure for its strength: hash set for O(1) validity checks, trie for prefix-based suggestions.
Conclusion
| Factor | Hash Map | Trie |
|---|---|---|
| Exact lookup | Excellent | Good |
| Prefix search | Poor | Excellent |
| Memory (dense) | Good | Moderate |
| Memory (sparse) | Excellent | Poor |
| Non-string keys | Yes | No |
| Worst-case predictability | Poor | Excellent |
| Implementation complexity | Low | Moderate |
Don’t forget hybrid structures exist. Radix trees (compressed tries) reduce memory overhead by collapsing single-child chains. Hash array mapped tries (HAMTs) combine hashing with trie structure for persistent data structures. Patricia tries optimize for long keys with common prefixes.
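To make the radix-tree idea concrete, here is a small sketch that builds a plain trie and then collapses its single-child, non-terminal chains into labeled edges, comparing node counts before and after. The compressed representation (a nested dict of edge labels) and the function names are illustrative, not a full radix-tree implementation:

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

def insert(root, word):
    node = root
    for char in word:
        node = node.children.setdefault(char, TrieNode())
    node.is_end = True

def compress(node):
    """Return {edge_label: (is_end, children)} with single-child,
    non-terminal chains merged into one multi-character edge."""
    out = {}
    for char, child in node.children.items():
        label = char
        while len(child.children) == 1 and not child.is_end:
            (next_char, next_child), = child.children.items()
            label += next_char
            child = next_child
        out[label] = (child.is_end, compress(child))
    return out

def count_plain(node):
    return 1 + sum(count_plain(c) for c in node.children.values())

def count_compressed(children):
    return 1 + sum(count_compressed(sub) for _, sub in children.values())

root = TrieNode()
for word in ["romane", "romanus", "romulus"]:
    insert(root, word)

compressed = compress(root)
print(count_plain(root), count_compressed(compressed))
# 13 plain nodes collapse to 6: edges like "rom", "an", and "ulus"
# replace whole chains of one-character nodes.
```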
The right choice depends on your access patterns. Profile with real data, but start with this heuristic: if you’re ever calling startswith() in a loop, you probably want a trie.