Cartesian Tree: Min-Heap with BST Properties
Key Insights
- Cartesian trees uniquely combine BST ordering (in-order traversal matches input sequence) with min-heap property (parents always smaller than children), creating exactly one valid tree for any array.
- The linear-time construction algorithm using a monotonic stack is essential knowledge—it transforms an O(n²) naive approach into O(n), making Cartesian trees practical for large datasets.
- The LCA-to-RMQ reduction is the killer application: finding the lowest common ancestor in a Cartesian tree directly answers range minimum queries on the original array, enabling O(1) queries after O(n) preprocessing.
What is a Cartesian Tree?
A Cartesian tree is a binary tree derived from a sequence of numbers that simultaneously satisfies two properties: it maintains BST ordering based on array indices, and it enforces the min-heap property based on values. This dual constraint creates a unique structure—for any given array, exactly one valid Cartesian tree exists.
Consider an array [3, 2, 6, 1, 9]. The minimum element (1 at index 3) becomes the root. Everything to its left in the array forms the left subtree, everything to the right forms the right subtree. This recursive definition continues until the entire array is represented.
The result is a tree where:
- An in-order traversal produces the original sequence
- Every parent node has a smaller value than its children
This isn’t just an academic curiosity. Cartesian trees power efficient range minimum queries, form the backbone of treaps, and appear in computational geometry algorithms. Understanding them unlocks a class of problems that would otherwise require more complex solutions.
Core Properties and Structure
The Cartesian tree maintains two invariants that must hold simultaneously:
BST Property by Position: For any node at position i in the original array, all nodes in its left subtree come from positions < i, and all nodes in its right subtree come from positions > i. This means an in-order traversal reconstructs the original array.
Min-Heap Property by Value: Every node’s value is less than or equal to all values in its subtree. The root always contains the minimum element of the entire array.
For the array [3, 2, 6, 1, 9]:
        1(3)
       /    \
    2(1)    9(4)
   /    \
 3(0)   6(2)
Each node shows value(index). Notice that 1 is the root (minimum value), and the in-order traversal (3, 2, 6, 1, 9) matches our input.
Here’s the node structure:
class CartesianNode:
    def __init__(self, value: int, index: int):
        self.value = value
        self.index = index
        self.left: CartesianNode | None = None
        self.right: CartesianNode | None = None
        self.parent: CartesianNode | None = None

    def __repr__(self):
        return f"Node({self.value}, idx={self.index})"
The index field is crucial—it preserves the original position information that defines the BST ordering.
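As a quick sanity check, the tree from the diagram can be built by hand and walked in-order (a minimal sketch using a stripped-down stand-in for CartesianNode; the Node class and inorder helper here are just for this illustration):

```python
class Node:
    """Stripped-down stand-in for CartesianNode, just for this check."""
    def __init__(self, value, index, left=None, right=None):
        self.value, self.index = value, index
        self.left, self.right = left, right

# Hand-built tree for [3, 2, 6, 1, 9], matching the value(index) diagram above.
root = Node(1, 3,
            left=Node(2, 1, left=Node(3, 0), right=Node(6, 2)),
            right=Node(9, 4))

def inorder(node):
    """Yield values left to right; for a Cartesian tree this is the input array."""
    if node:
        yield from inorder(node.left)
        yield node.value
        yield from inorder(node.right)

print(list(inorder(root)))  # -> [3, 2, 6, 1, 9]
```

The traversal reproducing the input is exactly the BST-by-position property in action.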
Building a Cartesian Tree from an Array
The naive approach is straightforward: find the minimum element, make it the root, recursively build left and right subtrees from the subarrays. This works but runs in O(n²) worst case (imagine a sorted array—each recursive call scans the remaining elements).
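That naive approach can be sketched in a few lines, returning nested (value, left, right) tuples instead of node objects to stay self-contained (note that recursing on slices loses the global indices, which a real implementation must carry along):

```python
def build_naive(arr):
    """O(n^2) worst case: find the minimum, recurse on the two sides."""
    if not arr:
        return None
    m = min(range(len(arr)), key=arr.__getitem__)  # index of the minimum
    return (arr[m], build_naive(arr[:m]), build_naive(arr[m + 1:]))

print(build_naive([3, 2, 6, 1, 9]))
# -> (1, (2, (3, None, None), (6, None, None)), (9, None, None))
```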
The optimal approach uses a monotonic stack and processes elements left to right in O(n) time. The key insight: when we add a new element, we only need to consider the rightmost path of the tree built so far.
def build_cartesian_tree(arr: list[int]) -> CartesianNode | None:
    if not arr:
        return None

    stack: list[CartesianNode] = []
    root: CartesianNode | None = None

    for i, value in enumerate(arr):
        node = CartesianNode(value, i)
        last_popped: CartesianNode | None = None

        # Pop nodes with greater values; the last one popped
        # becomes the new node's left child
        while stack and stack[-1].value > value:
            last_popped = stack.pop()

        if last_popped:
            node.left = last_popped
            last_popped.parent = node

        if stack:
            # New node becomes right child of stack top
            stack[-1].right = node
            node.parent = stack[-1]
        else:
            # New node is the new root
            root = node

        stack.append(node)

    return root
The algorithm maintains a stack representing the rightmost path from root to the most recently added node. When inserting a new element:
- Pop all nodes with greater values (they violate heap property as ancestors)
- The last popped node becomes the new node’s left child
- The new node becomes the right child of whatever remains on top of the stack
- Push the new node onto the stack
Each element is pushed and popped at most once, giving us O(n) time complexity.
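The same algorithm is sometimes expressed with a parent array instead of node objects; this equivalent sketch (an alternative formulation, not the code above) makes the push/pop bookkeeping easy to test:

```python
def cartesian_parents(arr):
    """Return parent[i] for each index, -1 for the root, in O(n)."""
    n = len(arr)
    parent = [-1] * n
    stack = []  # rightmost path; values increase from bottom (root) to top
    for i in range(n):
        last = -1
        while stack and arr[stack[-1]] > arr[i]:
            last = stack.pop()
        if last != -1:
            parent[last] = i       # last popped index becomes the left child of i
        if stack:
            parent[i] = stack[-1]  # i becomes the right child of the stack top
        stack.append(i)
    return parent

print(cartesian_parents([3, 2, 6, 1, 9]))  # -> [1, 3, 1, -1, 3]; root is index 3
```

The output matches the earlier diagram: index 3 (value 1) is the root, indices 0 and 2 hang under index 1, and indices 1 and 4 hang under the root.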
Cartesian Trees and Range Minimum Queries
Here’s where Cartesian trees become genuinely powerful. The Lowest Common Ancestor (LCA) of two nodes in a Cartesian tree corresponds exactly to the Range Minimum Query (RMQ) in the original array.
Why? Consider nodes at positions i and j (where i < j). Their LCA is the node with the minimum value in the range [i, j]: by the heap property it is an ancestor of both, and by the BST property its index lies between them. For [3, 2, 6, 1, 9], the LCA of 3(0) and 6(2) is 2(1), and indeed the minimum of arr[0..2] is 2.
This reduction enables O(1) RMQ queries after O(n) preprocessing:
class CartesianRMQ:
    def __init__(self, arr: list[int]):
        self.arr = arr
        self.n = len(arr)
        self.root = build_cartesian_tree(arr)

        # Build node index mapping
        self.nodes: list[CartesianNode | None] = [None] * self.n
        self._map_nodes(self.root)

        # Euler tour for LCA
        self.euler: list[int] = []
        self.depth: list[int] = []
        self.first: list[int] = [-1] * self.n
        self._euler_tour(self.root, 0)

        # Sparse table for range minimum on depths
        self._build_sparse_table()

    def _map_nodes(self, node: CartesianNode | None):
        if node:
            self.nodes[node.index] = node
            self._map_nodes(node.left)
            self._map_nodes(node.right)

    def _euler_tour(self, node: CartesianNode | None, d: int):
        if not node:
            return
        if self.first[node.index] == -1:
            self.first[node.index] = len(self.euler)
        self.euler.append(node.index)
        self.depth.append(d)
        if node.left:
            self._euler_tour(node.left, d + 1)
            self.euler.append(node.index)
            self.depth.append(d)
        if node.right:
            self._euler_tour(node.right, d + 1)
            self.euler.append(node.index)
            self.depth.append(d)

    def _build_sparse_table(self):
        m = len(self.euler)
        if m == 0:
            return
        k = m.bit_length()
        self.sparse = [[0] * m for _ in range(k)]
        for i in range(m):
            self.sparse[0][i] = i
        for j in range(1, k):
            for i in range(m - (1 << j) + 1):
                left = self.sparse[j - 1][i]
                right = self.sparse[j - 1][i + (1 << (j - 1))]
                self.sparse[j][i] = left if self.depth[left] <= self.depth[right] else right

    def query(self, left: int, right: int) -> int:
        """Return minimum value in arr[left:right+1]."""
        if left > right:
            left, right = right, left
        l = self.first[left]
        r = self.first[right]
        if l > r:
            l, r = r, l
        length = r - l + 1
        k = length.bit_length() - 1
        left_idx = self.sparse[k][l]
        right_idx = self.sparse[k][r - (1 << k) + 1]
        lca_idx = left_idx if self.depth[left_idx] <= self.depth[right_idx] else right_idx
        return self.arr[self.euler[lca_idx]]
The preprocessing builds an Euler tour of the tree and a sparse table for range minimum queries on the depth array. Queries then run in O(1) time. Note that the sparse table as written costs O(n log n) to build; reaching the theoretical O(n) preprocessing bound requires the additional ±1-RMQ block decomposition.
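When testing an implementation like this, an independent baseline helps. A plain sparse table over the values themselves (a self-contained sketch, separate from the Cartesian-tree approach; the names build_sparse and query are illustrative) answers the same queries with O(n log n) preprocessing:

```python
def build_sparse(arr):
    """table[j][i] holds the minimum of arr[i : i + 2**j]."""
    n = len(arr)
    table = [arr[:]]
    j = 1
    while (1 << j) <= n:
        prev, half = table[j - 1], 1 << (j - 1)
        table.append([min(prev[i], prev[i + half])
                      for i in range(n - (1 << j) + 1)])
        j += 1
    return table

def query(table, l, r):
    """Minimum of arr[l..r] inclusive, via two overlapping power-of-two windows."""
    k = (r - l + 1).bit_length() - 1
    return min(table[k][l], table[k][r - (1 << k) + 1])

arr = [3, 2, 6, 1, 9]
t = build_sparse(arr)
print(query(t, 0, 2), query(t, 2, 4), query(t, 0, 4))  # -> 2 1 1
```

Cross-checking CartesianRMQ against a baseline like this over random arrays is a cheap way to catch off-by-one errors in the Euler tour indexing.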
Treaps: Randomized Cartesian Trees
A treap is a Cartesian tree where priorities are randomly assigned rather than derived from values. Each node has a key (for BST ordering) and a random priority (for heap ordering). This randomization keeps the expected height at O(log n), with no explicit rebalancing logic.
import random

class TreapNode:
    def __init__(self, key: int):
        self.key = key
        self.priority = random.random()
        self.left: TreapNode | None = None
        self.right: TreapNode | None = None

def split(root: TreapNode | None, key: int) -> tuple[TreapNode | None, TreapNode | None]:
    """Split tree into nodes < key and nodes >= key."""
    if not root:
        return None, None
    if root.key < key:
        left, right = split(root.right, key)
        root.right = left
        return root, right
    else:
        left, right = split(root.left, key)
        root.left = right
        return left, root

def merge(left: TreapNode | None, right: TreapNode | None) -> TreapNode | None:
    """Merge two treaps (all keys in left < all keys in right)."""
    if not left:
        return right
    if not right:
        return left
    if left.priority > right.priority:
        left.right = merge(left.right, right)
        return left
    else:
        right.left = merge(left, right.left)
        return right

def insert(root: TreapNode | None, key: int) -> TreapNode:
    left, right = split(root, key)
    new_node = TreapNode(key)
    return merge(merge(left, new_node), right)

def delete(root: TreapNode | None, key: int) -> TreapNode | None:
    left, right = split(root, key)
    _, right = split(right, key + 1)  # Remove the key itself
    return merge(left, right)
The split/merge paradigm makes treaps remarkably elegant. Insert splits the tree at the new key, creates a single-node tree, and merges everything back. Delete splits twice to isolate the target node, then merges the remaining pieces.
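A quick end-to-end exercise of these operations (a condensed, self-contained restatement of the treap code above, for demonstration only; priorities are random, so only the in-order key sequence is deterministic):

```python
import random

class TreapNode:
    """Condensed version of the treap node above, for a runnable demo."""
    def __init__(self, key):
        self.key = key
        self.priority = random.random()
        self.left = self.right = None

def split(root, key):
    """Return (treap with keys < key, treap with keys >= key)."""
    if not root:
        return None, None
    if root.key < key:
        root.right, right = split(root.right, key)
        return root, right
    left, root.left = split(root.left, key)
    return left, root

def merge(left, right):
    """Merge two treaps; every key in left must be < every key in right."""
    if not left or not right:
        return left or right
    if left.priority > right.priority:
        left.right = merge(left.right, right)
        return left
    right.left = merge(left, right.left)
    return right

def insert(root, key):
    left, right = split(root, key)
    return merge(merge(left, TreapNode(key)), right)

def delete(root, key):
    left, right = split(root, key)
    _, right = split(right, key + 1)  # discard the piece holding exactly `key`
    return merge(left, right)

def keys(node):
    """In-order key sequence: always sorted, whatever the random priorities were."""
    return keys(node.left) + [node.key] + keys(node.right) if node else []

root = None
for k in [5, 1, 4, 2, 3]:
    root = insert(root, k)
root = delete(root, 4)
print(keys(root))  # -> [1, 2, 3, 5]
```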
Practical Applications
Range Minimum Queries: As demonstrated, Cartesian trees provide O(n) preprocessing with O(1) queries. This beats segment trees (O(log n) queries) when you have many queries on a static array.
Sequence Manipulation: Treaps excel at operations like “reverse a subarray” or “move a range.” Augment nodes with subtree sizes and lazy propagation flags for O(log n) operations.
Priority-Based Scheduling: When you need both sorted order and priority access, Cartesian trees provide a natural representation. Task schedulers can efficiently find the highest-priority task within a time window.
Computational Geometry: Cartesian trees appear in algorithms for computing visibility, triangulations, and convex hulls where you need both spatial ordering and value-based selection.
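The "reverse a subarray" idea mentioned under sequence manipulation can be sketched with an implicit treap: nodes are addressed by position via subtree sizes, and a lazy flag defers the reversal (a minimal sketch under those assumptions, not code from this article; all names are illustrative):

```python
import random

class Node:
    def __init__(self, value):
        self.value = value
        self.priority = random.random()
        self.left = self.right = None
        self.size = 1
        self.rev = False  # lazy "reverse this subtree" flag

def size(n):
    return n.size if n else 0

def push(n):
    """Apply a pending reversal: swap children, defer the flag to them."""
    if n.rev:
        n.left, n.right = n.right, n.left
        if n.left:
            n.left.rev ^= True
        if n.right:
            n.right.rev ^= True
        n.rev = False

def update(n):
    n.size = 1 + size(n.left) + size(n.right)
    return n

def split(n, k):
    """Split by position: first k elements go to the left result."""
    if not n:
        return None, None
    push(n)
    if size(n.left) < k:
        n.right, right = split(n.right, k - size(n.left) - 1)
        return update(n), right
    left, n.left = split(n.left, k)
    return left, update(n)

def merge(a, b):
    if not a or not b:
        return a or b
    if a.priority > b.priority:
        push(a)
        a.right = merge(a.right, b)
        return update(a)
    push(b)
    b.left = merge(a, b.left)
    return update(b)

def build(values):
    root = None
    for v in values:
        root = merge(root, Node(v))  # append at the end; position = insertion order
    return root

def reverse_range(root, l, r):
    """Reverse positions l..r (0-based, inclusive); expected O(log n)."""
    left, rest = split(root, l)
    mid, right = split(rest, r - l + 1)
    mid.rev ^= True  # assumes a non-empty range
    return merge(merge(left, mid), right)

def to_list(n):
    if not n:
        return []
    push(n)
    return to_list(n.left) + [n.value] + to_list(n.right)

root = build([1, 2, 3, 4, 5])
root = reverse_range(root, 1, 3)
print(to_list(root))  # -> [1, 4, 3, 2, 5]
```

The same split/toggle/merge pattern supports "move a range" by splitting out the middle piece and merging it back elsewhere.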
Complexity Summary and When to Use
| Operation | Cartesian Tree | Treap (Expected) |
|---|---|---|
| Construction | O(n) | O(n log n) |
| RMQ Query | O(1)* | N/A |
| Insert | N/A | O(log n) |
| Delete | N/A | O(log n) |
| Split/Merge | N/A | O(log n) |
| Space | O(n) | O(n) |
*After preprocessing with an Euler tour plus sparse table (O(n log n) as implemented above; O(n) with the ±1-RMQ block decomposition)
Choose Cartesian trees when: You have a static array and need fast range minimum queries, or you’re implementing algorithms that naturally map to the LCA-RMQ connection.
Choose treaps when: You need a balanced BST with simple implementation, require split/merge operations, or want to avoid the complexity of red-black trees.
Choose segment trees when: You need range updates, not just queries, or when the underlying data changes frequently.
Cartesian trees occupy a specific niche, but within that niche, they’re remarkably effective. Master the stack-based construction and the RMQ reduction, and you’ll have a powerful tool for a class of problems that would otherwise require more complex machinery.