Euler Tour: Tree to Array Transformation
Trees are everywhere in software engineering—file systems, organizational hierarchies, DOM structures, and countless algorithmic problems. But trees have an annoying property: they don't play well...
Key Insights
- Euler Tour transforms tree structures into arrays where subtrees become contiguous ranges, enabling O(log n) subtree queries using standard range data structures
- By recording entry (tin) and exit (tout) timestamps during DFS, you create a mapping that preserves hierarchical relationships in linear form
- Combined with segment trees or sparse tables, Euler Tour enables efficient subtree sum/min/max queries and O(1) LCA computation after O(n log n) preprocessing
Introduction: Why Flatten a Tree?
Trees are everywhere in software engineering—file systems, organizational hierarchies, DOM structures, and countless algorithmic problems. But trees have an annoying property: they don’t play well with range query data structures.
Consider this problem: given a tree where each node has a value, answer thousands of queries asking “what’s the sum of all values in node X’s subtree?” A naive approach traverses the entire subtree for each query—O(n) per query, which becomes O(n × q) for q queries. That’s unacceptable for large inputs.
The Euler Tour technique solves this by transforming the tree into an array where each subtree maps to a contiguous range. Once flattened, you can apply segment trees, Fenwick trees, or sparse tables to answer queries in O(log n) or even O(1) time.
This technique powers solutions for subtree aggregate queries, path queries, and Lowest Common Ancestor (LCA) computation. If you’re working on tree problems in competitive programming or building systems that query hierarchical data, Euler Tour is essential knowledge.
Understanding the Euler Tour Traversal
The core idea is simple: perform a DFS traversal and record two timestamps for each node—when you first enter it (tin) and when you finish processing it (tout). These timestamps have a crucial property: for any node, all its descendants have timestamps between tin and tout.
Think of it as recording your path through the tree. When you enter a node, stamp the current time. Explore all children recursively. When you’re done with the subtree, stamp the exit time. The magic is that a node’s entire subtree gets consecutive timestamps.
Here’s the basic implementation:
class EulerTour:
def __init__(self, n: int):
self.n = n
self.adj = [[] for _ in range(n)]
self.tin = [0] * n
self.tout = [0] * n
self.timer = 0
def add_edge(self, u: int, v: int):
self.adj[u].append(v)
self.adj[v].append(u)
def dfs(self, node: int, parent: int = -1):
self.tin[node] = self.timer
self.timer += 1
for neighbor in self.adj[node]:
if neighbor != parent:
self.dfs(neighbor, node)
self.tout[node] = self.timer - 1
def build(self, root: int = 0):
self.timer = 0
self.dfs(root, -1)
After running build(), every node has a tin/tout pair. For any node X, the range [tin[X], tout[X]] contains exactly the timestamps of X and all its descendants.
Building the Euler Tour Array
The timestamps alone are useful, but to answer aggregate queries, you need an array that maps positions to node values. The tour array stores node values at their entry positions.
class EulerTourWithValues:
def __init__(self, n: int, values: list[int]):
self.n = n
self.values = values
self.adj = [[] for _ in range(n)]
self.tin = [0] * n
self.tout = [0] * n
self.tour = [0] * n # Flattened array
self.tour_to_node = [0] * n # Map position back to node
self.timer = 0
def add_edge(self, u: int, v: int):
self.adj[u].append(v)
self.adj[v].append(u)
def dfs(self, node: int, parent: int = -1):
self.tin[node] = self.timer
self.tour[self.timer] = self.values[node]
self.tour_to_node[self.timer] = node
self.timer += 1
for neighbor in self.adj[node]:
if neighbor != parent:
self.dfs(neighbor, node)
self.tout[node] = self.timer - 1
def build(self, root: int = 0):
self.timer = 0
self.dfs(root, -1)
def get_subtree_range(self, node: int) -> tuple[int, int]:
"""Returns inclusive range [l, r] for subtree of node."""
return (self.tin[node], self.tout[node])
For trees given as parent arrays (common in many problems), the construction adapts slightly:
def build_from_parent_array(parent: list[int], values: list[int]) -> EulerTourWithValues:
n = len(parent)
et = EulerTourWithValues(n, values)
root = -1
for i in range(n):
if parent[i] == -1:
root = i
else:
et.add_edge(i, parent[i])
et.build(root)
return et
Subtree Queries with Segment Trees
Now for the payoff. With the Euler Tour array, subtree queries become range queries. Pair it with a segment tree for O(log n) operations:
class SegmentTree:
def __init__(self, arr: list[int]):
self.n = len(arr)
self.tree = [0] * (2 * self.n)
# Build leaves
for i in range(self.n):
self.tree[self.n + i] = arr[i]
# Build internal nodes
for i in range(self.n - 1, 0, -1):
self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]
def update(self, pos: int, value: int):
pos += self.n
self.tree[pos] = value
while pos > 1:
pos //= 2
self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]
def query(self, left: int, right: int) -> int:
"""Sum of range [left, right] inclusive."""
result = 0
left += self.n
right += self.n + 1
while left < right:
if left & 1:
result += self.tree[left]
left += 1
if right & 1:
right -= 1
result += self.tree[right]
left //= 2
right //= 2
return result
class SubtreeQuerySolver:
def __init__(self, n: int, values: list[int]):
self.euler = EulerTourWithValues(n, values)
self.seg_tree = None
def add_edge(self, u: int, v: int):
self.euler.add_edge(u, v)
def build(self, root: int = 0):
self.euler.build(root)
self.seg_tree = SegmentTree(self.euler.tour)
def subtree_sum(self, node: int) -> int:
l, r = self.euler.get_subtree_range(node)
return self.seg_tree.query(l, r)
def update_node(self, node: int, new_value: int):
pos = self.euler.tin[node]
self.seg_tree.update(pos, new_value)
This handles subtree sum queries in O(log n) and point updates in O(log n). The same pattern works for min, max, or any associative operation.
Path Queries and LCA Applications
Euler Tour also enables efficient LCA computation. The key insight: in a DFS traversal that records nodes on both entry and exit (or records during traversal), the LCA of two nodes is the shallowest node between their first occurrences.
For LCA, we use a slightly different tour that records nodes multiple times:
class LCAEulerTour:
def __init__(self, n: int):
self.n = n
self.adj = [[] for _ in range(n)]
self.depth = [0] * n
self.first_occurrence = [0] * n
self.tour = [] # (depth, node) pairs
self.sparse_table = None
def add_edge(self, u: int, v: int):
self.adj[u].append(v)
self.adj[v].append(u)
def dfs(self, node: int, parent: int, d: int):
self.depth[node] = d
self.first_occurrence[node] = len(self.tour)
self.tour.append((d, node))
for neighbor in self.adj[node]:
if neighbor != parent:
self.dfs(neighbor, node, d + 1)
self.tour.append((d, node)) # Record on backtrack
def build_sparse_table(self):
m = len(self.tour)
k = m.bit_length()
self.sparse_table = [[None] * m for _ in range(k)]
# Initialize with tour values
for i in range(m):
self.sparse_table[0][i] = self.tour[i]
# Build sparse table for range minimum
for j in range(1, k):
for i in range(m - (1 << j) + 1):
left = self.sparse_table[j - 1][i]
right = self.sparse_table[j - 1][i + (1 << (j - 1))]
self.sparse_table[j][i] = min(left, right)
def build(self, root: int = 0):
self.dfs(root, -1, 0)
self.build_sparse_table()
def lca(self, u: int, v: int) -> int:
l = self.first_occurrence[u]
r = self.first_occurrence[v]
if l > r:
l, r = r, l
length = r - l + 1
k = length.bit_length() - 1
left_min = self.sparse_table[k][l]
right_min = self.sparse_table[k][r - (1 << k) + 1]
return min(left_min, right_min)[1]
After O(n log n) preprocessing, LCA queries run in O(1). This is often faster than binary lifting for query-heavy workloads.
Practical Considerations and Optimizations
Space complexity: The basic Euler Tour uses O(n) space for tin/tout arrays. The LCA variant uses O(2n) for the tour (each edge traversed twice) plus O(n log n) for the sparse table. For memory-constrained environments, consider binary lifting instead.
Common pitfalls: Off-by-one errors are rampant. Decide upfront whether your ranges are inclusive or exclusive. I recommend inclusive [tin, tout] ranges—they’re more intuitive for subtree queries.
0-indexing vs 1-indexing: Stick with 0-indexing throughout. Mixing conventions is a bug factory.
Dynamic trees: Euler Tour assumes a static tree. If you need to add/remove edges, look into Link-Cut Trees or Euler Tour Trees (a different data structure that maintains the tour dynamically).
Here’s a memory-efficient version that avoids redundant storage:
class CompactEulerTour:
__slots__ = ['tin', 'tout', 'tour', 'n']
def __init__(self, adj: list[list[int]], values: list[int], root: int = 0):
self.n = len(adj)
self.tin = [0] * self.n
self.tout = [0] * self.n
self.tour = [0] * self.n
# Iterative DFS to avoid stack overflow
timer = 0
stack = [(root, -1, False)]
while stack:
node, parent, processed = stack.pop()
if processed:
self.tout[node] = timer - 1
continue
self.tin[node] = timer
self.tour[timer] = values[node]
timer += 1
stack.append((node, parent, True))
for neighbor in reversed(adj[node]):
if neighbor != parent:
stack.append((neighbor, node, False))
The iterative version handles trees with depth exceeding Python’s recursion limit (typically 1000).
Conclusion
Euler Tour is the go-to technique when you need to answer subtree queries efficiently. It’s simpler than Heavy-Light Decomposition and more general than specialized structures.
Use Euler Tour when:
- You need subtree aggregate queries (sum, min, max)
- You want O(1) LCA after preprocessing
- Your tree is static or changes infrequently
Consider alternatives when:
- You need path queries between arbitrary nodes (use HLD)
- You’re finding tree centers or solving distance problems (use centroid decomposition)
- Your tree changes frequently (use Link-Cut Trees)
For practice, try these problems: SPOJ QTREE, Codeforces 620E (New Year Tree), and LeetCode’s tree query problems. Once you internalize the tin/tout mapping, you’ll see opportunities to apply it everywhere hierarchical data appears.