2D Segment Tree: Matrix Range Queries

Key Insights

2D segment trees use a “tree of trees” architecture where each node in the outer segment tree contains a complete inner segment tree, enabling O(log n × log m) queries on rectangular submatrices.
Building a 2D segment tree requires O(n × m) time and O(16 × n × m) space in the worst case—a significant overhead that’s justified only when query frequency is high relative to matrix size.
For most practical applications involving only point updates and range queries, 2D Fenwick trees offer better cache performance; reserve 2D segment trees for problems requiring lazy propagation or complex merge operations.

The Problem with Naive Matrix Queries

Consider a game engine tracking damage values across a 1000×1000 tile map. Players frequently query rectangular regions to calculate area-of-effect damage totals. With naive iteration, each query costs O(n × m) operations—up to a million iterations per query. When you’re processing thousands of queries per frame, this approach collapses.

Prefix sums solve this for static matrices with O(1) queries after O(n × m) preprocessing. But what happens when the matrix changes? Each update invalidates the prefix sum array, requiring O(n × m) recomputation. We need a structure that handles both updates and queries efficiently.

Enter the 2D segment tree: O(log n × log m) for both operations.

From 1D to 2D: The Tree of Trees

A 1D segment tree partitions an array into hierarchical intervals. Each node represents a range and stores an aggregate value (sum, min, max). Queries decompose into O(log n) node accesses.

The 2D extension isn’t a single tree—it’s a tree of trees. The outer segment tree partitions rows. But instead of storing simple values, each outer node contains a complete inner segment tree that partitions columns for that row range.

Here’s the conceptual structure:

struct SegmentTree2D {
    int rows, cols;
    // tree[i] represents outer node i, containing an inner tree for columns
    // Outer tree has ~4*rows nodes, each inner tree has ~4*cols nodes
    vector<vector<long long>> tree;
    
    SegmentTree2D(int n, int m) : rows(n), cols(m) {
        // Allocate outer tree nodes, each containing an inner tree
        tree.assign(4 * rows, vector<long long>(4 * cols, 0));
    }
};

When you query a rectangle [r1, c1] to [r2, c2], the outer tree identifies O(log n) nodes covering rows [r1, r2]. For each of these nodes, the inner tree queries columns [c1, c2] in O(log m) time. Total: O(log n × log m).

Building the 2D Segment Tree

Construction proceeds bottom-up in both dimensions. We first build inner trees for leaf nodes of the outer tree (individual rows), then merge them upward.

class SegmentTree2D {
private:
    int rows, cols;
    vector<vector<long long>> tree;
    vector<vector<int>> matrix;

    void buildY(int vx, int lx, int rx, int vy, int ly, int ry) {
        if (ly == ry) {
            if (lx == rx) {
                // Leaf in both dimensions: store matrix value
                tree[vx][vy] = matrix[lx][ly];
            } else {
                // Merge from children in outer tree
                tree[vx][vy] = tree[2*vx][vy] + tree[2*vx+1][vy];
            }
        } else {
            int mid = (ly + ry) / 2;
            buildY(vx, lx, rx, 2*vy, ly, mid);
            buildY(vx, lx, rx, 2*vy+1, mid+1, ry);
            tree[vx][vy] = tree[vx][2*vy] + tree[vx][2*vy+1];
        }
    }

    void buildX(int vx, int lx, int rx) {
        if (lx == rx) {
            // Leaf node in outer tree: build inner tree from single row
            buildY(vx, lx, rx, 1, 0, cols - 1);
        } else {
            int mid = (lx + rx) / 2;
            buildX(2*vx, lx, mid);
            buildX(2*vx+1, mid+1, rx);
            // Build inner tree by merging children's inner trees
            buildY(vx, lx, rx, 1, 0, cols - 1);
        }
    }

public:
    SegmentTree2D(vector<vector<int>>& mat) : matrix(mat) {
        rows = mat.size();
        cols = mat[0].size();
        tree.assign(4 * rows, vector<long long>(4 * cols, 0));
        buildX(1, 0, rows - 1);
    }
};

The build complexity is O(n × m × log n). Each of the O(n × m) matrix cells contributes to O(log n) outer nodes, and building each inner tree node is O(1) after its children exist. Space consumption is O(16 × n × m) in the worst case—four times the rows, four times the columns.

Range Query Implementation

Querying decomposes the problem: the outer tree handles row decomposition, and for each relevant outer node, we query its inner tree for the column range.

long long queryY(int vx, int vy, int tly, int try_, int ly, int ry) {
    if (ly > try_ || ry < tly) {
        return 0;  // Out of range
    }
    if (ly <= tly && try_ <= ry) {
        return tree[vx][vy];  // Fully contained
    }
    int mid = (tly + try_) / 2;
    return queryY(vx, 2*vy, tly, mid, ly, ry) +
           queryY(vx, 2*vy+1, mid+1, try_, ly, ry);
}

long long queryX(int vx, int tlx, int trx, int lx, int rx, int ly, int ry) {
    if (lx > trx || rx < tlx) {
        return 0;  // Row range doesn't intersect
    }
    if (lx <= tlx && trx <= rx) {
        // This outer node's row range is fully contained
        // Query its inner tree for column range
        return queryY(vx, 1, 0, cols - 1, ly, ry);
    }
    int mid = (tlx + trx) / 2;
    return queryX(2*vx, tlx, mid, lx, rx, ly, ry) +
           queryX(2*vx+1, mid+1, trx, lx, rx, ly, ry);
}

// Public interface: query rectangle [r1,c1] to [r2,c2]
long long query(int r1, int c1, int r2, int c2) {
    return queryX(1, 0, rows - 1, r1, r2, c1, c2);
}

The outer query identifies O(log n) nodes whose row ranges are fully contained in [r1, r2]. Each triggers an inner query costing O(log m). Total: O(log n × log m).

Point Update Operations

Updating cell (x, y) requires modifying all outer nodes whose ranges include row x, and within each, updating all inner nodes whose ranges include column y.

void updateY(int vx, int lx, int rx, int vy, int ly, int ry, int x, int y, int val) {
    if (ly == ry) {
        if (lx == rx) {
            tree[vx][vy] = val;  // Leaf: direct assignment
        } else {
            // Merge from children outer nodes at this column position
            tree[vx][vy] = tree[2*vx][vy] + tree[2*vx+1][vy];
        }
    } else {
        int mid = (ly + ry) / 2;
        if (y <= mid) {
            updateY(vx, lx, rx, 2*vy, ly, mid, x, y, val);
        } else {
            updateY(vx, lx, rx, 2*vy+1, mid+1, ry, x, y, val);
        }
        tree[vx][vy] = tree[vx][2*vy] + tree[vx][2*vy+1];
    }
}

void updateX(int vx, int lx, int rx, int x, int y, int val) {
    if (lx == rx) {
        // Leaf in outer tree: update inner tree directly
        updateY(vx, lx, rx, 1, 0, cols - 1, x, y, val);
    } else {
        int mid = (lx + rx) / 2;
        if (x <= mid) {
            updateX(2*vx, lx, mid, x, y, val);
        } else {
            updateX(2*vx+1, mid+1, rx, x, y, val);
        }
        // Rebuild this inner tree position by merging children
        updateY(vx, lx, rx, 1, 0, cols - 1, x, y, val);
    }
}

void update(int x, int y, int val) {
    matrix[x][y] = val;
    updateX(1, 0, rows - 1, x, y, val);
}

The update touches O(log n) outer nodes. For each, it modifies O(log m) inner nodes. Complexity: O(log n × log m).

Handling Range Updates with Lazy Propagation

Range updates in 2D segment trees introduce significant complexity. Lazy propagation requires storing pending updates at each node and pushing them down during queries.

// Pseudocode skeleton for 2D lazy propagation
struct Node2D {
    long long sum;
    long long lazyAdd;  // Pending addition for entire subtree
};

void pushDownY(int vx, int vy, int ly, int ry) {
    if (tree[vx][vy].lazyAdd != 0) {
        int mid = (ly + ry) / 2;
        // Apply lazy value to children
        tree[vx][2*vy].lazyAdd += tree[vx][vy].lazyAdd;
        tree[vx][2*vy].sum += tree[vx][vy].lazyAdd * (mid - ly + 1);
        tree[vx][2*vy+1].lazyAdd += tree[vx][vy].lazyAdd;
        tree[vx][2*vy+1].sum += tree[vx][vy].lazyAdd * (ry - mid);
        tree[vx][vy].lazyAdd = 0;
    }
}

// Similar pushDown needed for outer tree dimension
// Range update requires careful coordination between both levels

My recommendation: avoid 2D lazy propagation unless absolutely necessary. The implementation complexity and debugging difficulty rarely justify the benefit. Consider problem reformulation or alternative data structures first.

Performance Analysis and When to Use What

2D Segment Tree vs. Alternatives:

Operation	2D Segment Tree	2D Fenwick Tree	Prefix Sums
Build	O(nm log n)	O(nm log n log m)	O(nm)
Point Update	O(log n log m)	O(log n log m)	O(nm)
Range Query	O(log n log m)	O(log n log m)	O(1)
Range Update	O(log n log m)*	Not supported	O(nm)
Space	O(16nm)	O(nm)	O(nm)

*With lazy propagation

Use 2D segment trees when:

You need range updates with lazy propagation
Your merge operation is complex (not just sum/min/max)
Query patterns don’t align with Fenwick tree’s prefix-based approach

Use 2D Fenwick trees when:

Only point updates and range sum queries are needed
Memory is constrained
Cache performance matters (Fenwick trees have better locality)

Use prefix sums when:

The matrix is static or rarely updated
Query frequency vastly exceeds update frequency

For image processing pipelines computing regional statistics, game engines with dynamic tile maps, or GIS systems aggregating spatial data, 2D segment trees provide the flexibility to handle complex operations. But for simpler sum queries with point updates, Fenwick trees win on practical performance despite identical asymptotic complexity.