Segment Tree with Lazy Propagation
Range query problems appear everywhere in competitive programming and production systems alike. You might need to find the sum of elements in a subarray, locate the minimum value in a range, or...
Key Insights
- Lazy propagation defers range updates by storing pending operations at internal nodes, pushing them down only when necessary—transforming O(n) range updates into O(log n) operations.
- The key invariant is simple: before accessing any node’s children, always propagate pending updates downward first. Violating this invariant is the most common source of bugs.
- Segment trees with lazy propagation excel when you need both range updates and range queries; for point-only updates, simpler structures like Fenwick trees often suffice.
Introduction & Motivation
Range query problems appear everywhere in competitive programming and production systems alike. You might need to find the sum of elements in a subarray, locate the minimum value in a range, or update entire segments of data at once. Database engines, time-series analytics, and game physics systems all face these challenges.
Naive approaches crumble under scale. Precomputing prefix sums gives you O(1) range queries, but updates become O(n). Storing raw arrays means O(n) queries. When your application demands both fast queries and fast updates—especially range updates that modify thousands of elements simultaneously—you need a more sophisticated data structure.
The segment tree solves the query-update tradeoff beautifully, but its standard form still requires O(n) time for range updates. Lazy propagation fixes this, giving you O(log n) for both operations. Let’s build it from the ground up.
Segment Tree Refresher
A segment tree is a binary tree where each node represents a contiguous segment of the array. The root covers the entire array, and each node’s children split its range in half. Leaves represent individual elements.
This structure enables O(log n) queries by combining at most O(log n) precomputed segment results. Point updates propagate up the tree, updating O(log n) ancestor nodes.
class SegmentTree:
def __init__(self, arr):
self.n = len(arr)
self.tree = [0] * (4 * self.n) # Safe upper bound
self._build(arr, 1, 0, self.n - 1)
def _build(self, arr, node, start, end):
if start == end:
self.tree[node] = arr[start]
return
mid = (start + end) // 2
self._build(arr, 2 * node, start, mid)
self._build(arr, 2 * node + 1, mid + 1, end)
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
def point_update(self, idx, val, node=1, start=0, end=None):
if end is None:
end = self.n - 1
if start == end:
self.tree[node] = val
return
mid = (start + end) // 2
if idx <= mid:
self.point_update(idx, val, 2 * node, start, mid)
else:
self.point_update(idx, val, 2 * node + 1, mid + 1, end)
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
def query(self, l, r, node=1, start=0, end=None):
if end is None:
end = self.n - 1
if r < start or l > end: # No overlap
return 0
if l <= start and end <= r: # Complete overlap
return self.tree[node]
mid = (start + end) // 2
left_sum = self.query(l, r, 2 * node, start, mid)
right_sum = self.query(l, r, 2 * node + 1, mid + 1, end)
return left_sum + right_sum
This baseline gives us O(log n) point updates and O(log n) range queries. But what happens when we need to update an entire range?
The Problem with Range Updates
Suppose you need to add a value to every element in a range [l, r]. With the basic segment tree, you’re forced to perform individual point updates:
def naive_range_update(self, l, r, delta):
"""Add delta to all elements in range [l, r]."""
for i in range(l, r + 1):
self.point_update(i, self.tree[self._get_leaf_index(i)] + delta)
Each point update costs O(log n), and you’re doing (r - l + 1) of them. For a range spanning the entire array, that’s O(n log n) per update. Run a thousand range updates on a million-element array, and you’re looking at billions of operations.
The fundamental issue: we’re doing work we don’t need to do yet. If a query never touches certain elements, why bother updating them immediately? This insight leads directly to lazy propagation.
Lazy Propagation Concept
Lazy propagation embodies strategic procrastination. Instead of immediately applying a range update to all affected nodes, we mark the update as “pending” at the highest possible nodes and defer the actual work until absolutely necessary.
Think of it like a manager delegating tasks. When a range update arrives, the segment tree says: “I’ll note that this entire segment needs updating, but I won’t bother my children with the details unless someone actually asks about them.”
The mechanism requires two components:
- Lazy array: Parallel to the tree array, storing pending updates for each node.
- Push-down operation: Before accessing a node’s children, propagate any pending updates from parent to children.
The invariant is critical: a node’s value is always correct for its segment, accounting for any pending updates stored at that node. When we push down, we transfer responsibility to the children while clearing the parent’s pending state.
Visually, imagine pending updates as sticky notes attached to internal nodes. Queries and updates that need to descend past a node first check for sticky notes, apply them to children, and remove the note before proceeding.
Implementation Deep Dive
Here’s a complete implementation supporting range addition and range sum queries:
class LazySegmentTree:
def __init__(self, arr):
self.n = len(arr)
self.tree = [0] * (4 * self.n)
self.lazy = [0] * (4 * self.n) # Pending additions
self._build(arr, 1, 0, self.n - 1)
def _build(self, arr, node, start, end):
if start == end:
self.tree[node] = arr[start]
return
mid = (start + end) // 2
self._build(arr, 2 * node, start, mid)
self._build(arr, 2 * node + 1, mid + 1, end)
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
def _push_down(self, node, start, end):
"""Propagate pending updates to children."""
if self.lazy[node] == 0:
return
mid = (start + end) // 2
left_child = 2 * node
right_child = 2 * node + 1
# Apply pending update to left child
self.tree[left_child] += self.lazy[node] * (mid - start + 1)
self.lazy[left_child] += self.lazy[node]
# Apply pending update to right child
self.tree[right_child] += self.lazy[node] * (end - mid)
self.lazy[right_child] += self.lazy[node]
# Clear parent's pending update
self.lazy[node] = 0
def range_update(self, l, r, delta, node=1, start=0, end=None):
"""Add delta to all elements in range [l, r]."""
if end is None:
end = self.n - 1
if r < start or l > end: # No overlap
return
if l <= start and end <= r: # Complete overlap
# Apply update to this node and mark as lazy
self.tree[node] += delta * (end - start + 1)
self.lazy[node] += delta
return
# Partial overlap: must descend
self._push_down(node, start, end) # Clear pending before descending
mid = (start + end) // 2
self.range_update(l, r, delta, 2 * node, start, mid)
self.range_update(l, r, delta, 2 * node + 1, mid + 1, end)
# Recalculate current node from children
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
def query(self, l, r, node=1, start=0, end=None):
"""Return sum of elements in range [l, r]."""
if end is None:
end = self.n - 1
if r < start or l > end: # No overlap
return 0
if l <= start and end <= r: # Complete overlap
return self.tree[node]
# Partial overlap: must descend
self._push_down(node, start, end) # Clear pending before descending
mid = (start + end) // 2
left_sum = self.query(l, r, 2 * node, start, mid)
right_sum = self.query(l, r, 2 * node + 1, mid + 1, end)
return left_sum + right_sum
Notice the pattern: both range_update and query call _push_down before accessing children. This maintains our invariant. The update multiplies delta by segment length because we’re tracking sums—each element in the segment receives the delta.
Common Variations & Extensions
Range Assignment replaces addition with setting all elements to a specific value. The lazy value represents the assigned value, and we need a sentinel to distinguish “no pending update” from “assign zero”:
class LazySegmentTreeAssign:
NO_UPDATE = float('inf') # Sentinel value
def __init__(self, arr):
self.n = len(arr)
self.tree = [0] * (4 * self.n)
self.lazy = [self.NO_UPDATE] * (4 * self.n)
self._build(arr, 1, 0, self.n - 1)
def _push_down(self, node, start, end):
if self.lazy[node] == self.NO_UPDATE:
return
mid = (start + end) // 2
left_child, right_child = 2 * node, 2 * node + 1
# Assignment overwrites previous values entirely
self.tree[left_child] = self.lazy[node] * (mid - start + 1)
self.lazy[left_child] = self.lazy[node]
self.tree[right_child] = self.lazy[node] * (end - mid)
self.lazy[right_child] = self.lazy[node]
self.lazy[node] = self.NO_UPDATE
def range_assign(self, l, r, val, node=1, start=0, end=None):
if end is None:
end = self.n - 1
if r < start or l > end:
return
if l <= start and end <= r:
self.tree[node] = val * (end - start + 1)
self.lazy[node] = val
return
self._push_down(node, start, end)
mid = (start + end) // 2
self.range_assign(l, r, val, 2 * node, start, mid)
self.range_assign(l, r, val, 2 * node + 1, mid + 1, end)
self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
Range Min/Max Queries work similarly, but the tree stores minimums/maximums instead of sums. For range addition with min queries, adding delta to a segment simply adds delta to that segment’s minimum.
Combining Operations gets tricky. If you need both range assignment and range addition, you must define how they interact. Typically, assignment clears any pending addition. The lazy state becomes a pair (assigned_value, pending_addition), with careful ordering in push-down.
Complexity Analysis & Practical Tips
Time Complexity: Both range updates and range queries run in O(log n). Each operation visits at most O(log n) nodes, and push-down at each node is O(1).
Space Complexity: O(n) for both tree and lazy arrays. The 4n multiplier is a safe upper bound; 2n suffices for power-of-two sizes.
Common Pitfalls:
- Forgetting to push down: The most frequent bug. Always propagate before accessing children, in both updates and queries.
- Integer overflow: Range sums can explode. Use 64-bit integers for large arrays or large update values.
- Incorrect segment length calculation: Off-by-one errors in (end - start + 1) versus (end - mid) are subtle and deadly.
- Mixing update types without proper handling: Assignment and addition don’t commute. Define clear semantics.
When to Use Alternatives:
- Fenwick trees: Simpler and faster for point updates with prefix queries. No range update support without tricks.
- Sqrt decomposition: Easier to implement, O(√n) operations. Good enough for many practical scenarios.
- Sparse segment trees: When the index space is huge but updates are sparse.
Lazy propagation shines when you genuinely need both range updates and range queries with O(log n) guarantees. For simpler requirements, reach for simpler tools. But when the problem demands it, this technique delivers performance that naive approaches simply cannot match.