Euler Tour: Tree to Array Transformation

Key Insights

Euler Tour transforms tree structures into arrays where subtrees become contiguous ranges, enabling O(log n) subtree queries using standard range data structures
By recording entry (tin) and exit (tout) timestamps during DFS, you create a mapping that preserves hierarchical relationships in linear form
Combined with segment trees or sparse tables, Euler Tour enables efficient subtree sum/min/max queries and O(1) LCA computation after O(n log n) preprocessing

Introduction: Why Flatten a Tree?

Trees are everywhere in software engineering—file systems, organizational hierarchies, DOM structures, and countless algorithmic problems. But trees have an annoying property: they don’t play well with range query data structures.

Consider this problem: given a tree where each node has a value, answer thousands of queries asking “what’s the sum of all values in node X’s subtree?” A naive approach traverses the entire subtree for each query—O(n) per query, which becomes O(n × q) for q queries. That’s unacceptable for large inputs.

The Euler Tour technique solves this by transforming the tree into an array where each subtree maps to a contiguous range. Once flattened, you can apply segment trees, Fenwick trees, or sparse tables to answer queries in O(log n) or even O(1) time.

This technique powers solutions for subtree aggregate queries, path queries, and Lowest Common Ancestor (LCA) computation. If you’re working on tree problems in competitive programming or building systems that query hierarchical data, Euler Tour is essential knowledge.

Understanding the Euler Tour Traversal

The core idea is simple: perform a DFS traversal and record two timestamps for each node—when you first enter it (tin) and when you finish processing it (tout). These timestamps have a crucial property: for any node, all its descendants have timestamps between tin and tout.

Think of it as recording your path through the tree. When you enter a node, stamp the current time. Explore all children recursively. When you’re done with the subtree, stamp the exit time. The magic is that a node’s entire subtree gets consecutive timestamps.

Here’s the basic implementation:

class EulerTour:
    def __init__(self, n: int):
        self.n = n
        self.adj = [[] for _ in range(n)]
        self.tin = [0] * n
        self.tout = [0] * n
        self.timer = 0
    
    def add_edge(self, u: int, v: int):
        self.adj[u].append(v)
        self.adj[v].append(u)
    
    def dfs(self, node: int, parent: int = -1):
        self.tin[node] = self.timer
        self.timer += 1
        
        for neighbor in self.adj[node]:
            if neighbor != parent:
                self.dfs(neighbor, node)
        
        self.tout[node] = self.timer - 1
    
    def build(self, root: int = 0):
        self.timer = 0
        self.dfs(root, -1)

After running build(), every node has a tin/tout pair. For any node X, the range [tin[X], tout[X]] contains exactly the timestamps of X and all its descendants.

Building the Euler Tour Array

The timestamps alone are useful, but to answer aggregate queries, you need an array that maps positions to node values. The tour array stores node values at their entry positions.

class EulerTourWithValues:
    def __init__(self, n: int, values: list[int]):
        self.n = n
        self.values = values
        self.adj = [[] for _ in range(n)]
        self.tin = [0] * n
        self.tout = [0] * n
        self.tour = [0] * n  # Flattened array
        self.tour_to_node = [0] * n  # Map position back to node
        self.timer = 0
    
    def add_edge(self, u: int, v: int):
        self.adj[u].append(v)
        self.adj[v].append(u)
    
    def dfs(self, node: int, parent: int = -1):
        self.tin[node] = self.timer
        self.tour[self.timer] = self.values[node]
        self.tour_to_node[self.timer] = node
        self.timer += 1
        
        for neighbor in self.adj[node]:
            if neighbor != parent:
                self.dfs(neighbor, node)
        
        self.tout[node] = self.timer - 1
    
    def build(self, root: int = 0):
        self.timer = 0
        self.dfs(root, -1)
    
    def get_subtree_range(self, node: int) -> tuple[int, int]:
        """Returns inclusive range [l, r] for subtree of node."""
        return (self.tin[node], self.tout[node])

For trees given as parent arrays (common in many problems), the construction adapts slightly:

def build_from_parent_array(parent: list[int], values: list[int]) -> EulerTourWithValues:
    n = len(parent)
    et = EulerTourWithValues(n, values)
    root = -1
    
    for i in range(n):
        if parent[i] == -1:
            root = i
        else:
            et.add_edge(i, parent[i])
    
    et.build(root)
    return et

Subtree Queries with Segment Trees

Now for the payoff. With the Euler Tour array, subtree queries become range queries. Pair it with a segment tree for O(log n) operations:

class SegmentTree:
    def __init__(self, arr: list[int]):
        self.n = len(arr)
        self.tree = [0] * (2 * self.n)
        
        # Build leaves
        for i in range(self.n):
            self.tree[self.n + i] = arr[i]
        
        # Build internal nodes
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]
    
    def update(self, pos: int, value: int):
        pos += self.n
        self.tree[pos] = value
        
        while pos > 1:
            pos //= 2
            self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]
    
    def query(self, left: int, right: int) -> int:
        """Sum of range [left, right] inclusive."""
        result = 0
        left += self.n
        right += self.n + 1
        
        while left < right:
            if left & 1:
                result += self.tree[left]
                left += 1
            if right & 1:
                right -= 1
                result += self.tree[right]
            left //= 2
            right //= 2
        
        return result


class SubtreeQuerySolver:
    def __init__(self, n: int, values: list[int]):
        self.euler = EulerTourWithValues(n, values)
        self.seg_tree = None
    
    def add_edge(self, u: int, v: int):
        self.euler.add_edge(u, v)
    
    def build(self, root: int = 0):
        self.euler.build(root)
        self.seg_tree = SegmentTree(self.euler.tour)
    
    def subtree_sum(self, node: int) -> int:
        l, r = self.euler.get_subtree_range(node)
        return self.seg_tree.query(l, r)
    
    def update_node(self, node: int, new_value: int):
        pos = self.euler.tin[node]
        self.seg_tree.update(pos, new_value)

This handles subtree sum queries in O(log n) and point updates in O(log n). The same pattern works for min, max, or any associative operation.

Path Queries and LCA Applications

Euler Tour also enables efficient LCA computation. The key insight: in a DFS traversal that records nodes on both entry and exit (or records during traversal), the LCA of two nodes is the shallowest node between their first occurrences.

For LCA, we use a slightly different tour that records nodes multiple times:

class LCAEulerTour:
    def __init__(self, n: int):
        self.n = n
        self.adj = [[] for _ in range(n)]
        self.depth = [0] * n
        self.first_occurrence = [0] * n
        self.tour = []  # (depth, node) pairs
        self.sparse_table = None
    
    def add_edge(self, u: int, v: int):
        self.adj[u].append(v)
        self.adj[v].append(u)
    
    def dfs(self, node: int, parent: int, d: int):
        self.depth[node] = d
        self.first_occurrence[node] = len(self.tour)
        self.tour.append((d, node))
        
        for neighbor in self.adj[node]:
            if neighbor != parent:
                self.dfs(neighbor, node, d + 1)
                self.tour.append((d, node))  # Record on backtrack
    
    def build_sparse_table(self):
        m = len(self.tour)
        k = m.bit_length()
        self.sparse_table = [[None] * m for _ in range(k)]
        
        # Initialize with tour values
        for i in range(m):
            self.sparse_table[0][i] = self.tour[i]
        
        # Build sparse table for range minimum
        for j in range(1, k):
            for i in range(m - (1 << j) + 1):
                left = self.sparse_table[j - 1][i]
                right = self.sparse_table[j - 1][i + (1 << (j - 1))]
                self.sparse_table[j][i] = min(left, right)
    
    def build(self, root: int = 0):
        self.dfs(root, -1, 0)
        self.build_sparse_table()
    
    def lca(self, u: int, v: int) -> int:
        l = self.first_occurrence[u]
        r = self.first_occurrence[v]
        if l > r:
            l, r = r, l
        
        length = r - l + 1
        k = length.bit_length() - 1
        
        left_min = self.sparse_table[k][l]
        right_min = self.sparse_table[k][r - (1 << k) + 1]
        
        return min(left_min, right_min)[1]

After O(n log n) preprocessing, LCA queries run in O(1). This is often faster than binary lifting for query-heavy workloads.

Practical Considerations and Optimizations

Space complexity: The basic Euler Tour uses O(n) space for tin/tout arrays. The LCA variant uses O(2n) for the tour (each edge traversed twice) plus O(n log n) for the sparse table. For memory-constrained environments, consider binary lifting instead.

Common pitfalls: Off-by-one errors are rampant. Decide upfront whether your ranges are inclusive or exclusive. I recommend inclusive [tin, tout] ranges—they’re more intuitive for subtree queries.

0-indexing vs 1-indexing: Stick with 0-indexing throughout. Mixing conventions is a bug factory.

Dynamic trees: Euler Tour assumes a static tree. If you need to add/remove edges, look into Link-Cut Trees or Euler Tour Trees (a different data structure that maintains the tour dynamically).

Here’s a memory-efficient version that avoids redundant storage:

class CompactEulerTour:
    __slots__ = ['tin', 'tout', 'tour', 'n']
    
    def __init__(self, adj: list[list[int]], values: list[int], root: int = 0):
        self.n = len(adj)
        self.tin = [0] * self.n
        self.tout = [0] * self.n
        self.tour = [0] * self.n
        
        # Iterative DFS to avoid stack overflow
        timer = 0
        stack = [(root, -1, False)]
        
        while stack:
            node, parent, processed = stack.pop()
            
            if processed:
                self.tout[node] = timer - 1
                continue
            
            self.tin[node] = timer
            self.tour[timer] = values[node]
            timer += 1
            
            stack.append((node, parent, True))
            
            for neighbor in reversed(adj[node]):
                if neighbor != parent:
                    stack.append((neighbor, node, False))

The iterative version handles trees with depth exceeding Python’s recursion limit (typically 1000).

Conclusion

Euler Tour is the go-to technique when you need to answer subtree queries efficiently. It’s simpler than Heavy-Light Decomposition and more general than specialized structures.

Use Euler Tour when:

You need subtree aggregate queries (sum, min, max)
You want O(1) LCA after preprocessing
Your tree is static or changes infrequently

Consider alternatives when:

You need path queries between arbitrary nodes (use HLD)
You’re finding tree centers or solving distance problems (use centroid decomposition)
Your tree changes frequently (use Link-Cut Trees)

For practice, try these problems: SPOJ QTREE, Codeforces 620E (New Year Tree), and LeetCode’s tree query problems. Once you internalize the tin/tout mapping, you’ll see opportunities to apply it everywhere hierarchical data appears.