Sparse Table: Static Range Minimum Query

Key Insights

  • Sparse tables exploit the idempotent property of minimum operations—overlapping ranges don’t affect correctness, enabling O(1) queries with O(n log n) preprocessing
  • The core insight is that any range can be covered by exactly two power-of-2 sized ranges, and taking the minimum of overlapping answers is still correct
  • Choose sparse tables over segment trees when your data is static; the constant-factor performance improvement and simpler query logic often matter in competitive programming and performance-critical applications

The Range Minimum Query Problem

Range Minimum Query (RMQ) is deceptively simple: given an array and two indices, return the minimum value between them. This operation appears everywhere—from finding lowest common ancestors in trees to solving computational geometry problems to building suffix array data structures.

The “static” constraint means our array never changes after construction. No insertions, no deletions, no updates. This restriction might seem limiting, but it unlocks a powerful optimization that dynamic data structures can’t match.

Consider these real-world applications: computing LCA in O(1) after Euler tour reduction, finding the closest pair of points in divide-and-conquer algorithms, or answering thousands of minimum queries over sensor readings that arrive in batches. When updates are rare but queries are frequent, sparse tables shine.

Naive Approaches and Their Limitations

The brute force solution is obvious: iterate through the range and track the minimum.

int queryNaive(const vector<int>& arr, int left, int right) {
    int result = arr[left];
    for (int i = left + 1; i <= right; i++) {
        result = min(result, arr[i]);
    }
    return result;
}

This runs in O(n) per query. For Q queries, that’s O(nQ) total—unacceptable when both n and Q reach into the millions.

The opposite extreme precomputes every possible answer:

// Precompute all O(n²) range minimums
vector<vector<int>> precompute(const vector<int>& arr) {
    int n = arr.size();
    vector<vector<int>> table(n, vector<int>(n));
    
    for (int i = 0; i < n; i++) {
        table[i][i] = arr[i];
        for (int j = i + 1; j < n; j++) {
            table[i][j] = min(table[i][j-1], arr[j]);
        }
    }
    return table;
}

Now queries are O(1), but we’ve used O(n²) space and preprocessing time. For n = 10⁶, that’s 10¹² table entries: roughly four terabytes of 4-byte integers. Not practical.

We need a middle ground: constant-time queries without quadratic space. Sparse tables deliver O(1) queries with only O(n log n) space and preprocessing.

The Sparse Table Intuition

The breakthrough insight comes from a property of the minimum operation: it’s idempotent. Taking the minimum of overlapping ranges gives the same result as the minimum of the union. Formally, min(min(A), min(B)) = min(A ∪ B), regardless of overlap.

This is not true for sum. If ranges overlap, you’d count elements twice. But for min, max, gcd, bitwise AND, and bitwise OR, overlapping doesn’t matter.

Here’s the key observation: any range [L, R] of length k can be covered by two ranges of length 2^⌊log₂(k)⌋. One starts at L, the other ends at R. They overlap in the middle, but we don’t care—we’re computing minimum.

Instead of precomputing all n² ranges, we only precompute ranges whose lengths are powers of 2. There are only log n distinct power-of-2 lengths, and n starting positions for each. That’s O(n log n) ranges total.

Building the Sparse Table

We define sparse[i][j] as the minimum value in the range starting at index i with length 2^j. The range is [i, i + 2^j - 1].

The base case is trivial: sparse[i][0] = arr[i] (ranges of length 1).

The recurrence exploits a beautiful property: a range of length 2^j can be split into two adjacent ranges of length 2^(j-1).

sparse[i][j] = min(sparse[i][j-1], sparse[i + 2^(j-1)][j-1])

The first half covers [i, i + 2^(j-1) - 1], the second covers [i + 2^(j-1), i + 2^j - 1].

void buildSparseTable(const vector<int>& arr, 
                      vector<vector<int>>& sparse,
                      int n, int logN) {
    // Base case: ranges of length 1
    for (int i = 0; i < n; i++) {
        sparse[i][0] = arr[i];
    }
    
    // Fill table using DP
    for (int j = 1; j <= logN; j++) {
        for (int i = 0; i + (1 << j) <= n; i++) {
            sparse[i][j] = min(
                sparse[i][j-1],
                sparse[i + (1 << (j-1))][j-1]
            );
        }
    }
}

The inner loop condition i + (1 << j) <= n ensures we don’t read past the array bounds. We process increasing values of j so that smaller ranges are computed before larger ones need them.

Answering Queries in O(1)

Given a query range [L, R], we compute the length len = R - L + 1 and find k = ⌊log₂(len)⌋. Then we query two overlapping ranges:

  1. sparse[L][k] — the range [L, L + 2^k - 1]
  2. sparse[R - 2^k + 1][k] — the range [R - 2^k + 1, R]

These two ranges together cover [L, R] completely (with overlap in the middle), and their minimum is our answer.

Computing floor-log for every query would add overhead. Instead, we precompute a log table:

void precomputeLog(vector<int>& logTable, int n) {
    logTable[1] = 0;
    for (int i = 2; i <= n; i++) {
        logTable[i] = logTable[i / 2] + 1;
    }
}

int query(const vector<vector<int>>& sparse,
          const vector<int>& logTable,
          int left, int right) {
    int len = right - left + 1;
    int k = logTable[len];
    return min(
        sparse[left][k],
        sparse[right - (1 << k) + 1][k]
    );
}

The log table uses the recurrence log₂(n) = log₂(n/2) + 1. After O(n) preprocessing, every log lookup is O(1).

Complete Implementation

Here’s a complete sparse table implementation (it assumes a non-empty input array):

class SparseTable {
private:
    vector<vector<int>> sparse;
    vector<int> logTable;
    int n;
    
public:
    SparseTable(const vector<int>& arr) {
        n = arr.size();
        int logN = 32 - __builtin_clz(n); // floor(log2(n)) + 1; undefined for n = 0
        
        // Precompute log values
        logTable.resize(n + 1);
        logTable[1] = 0;
        for (int i = 2; i <= n; i++) {
            logTable[i] = logTable[i / 2] + 1;
        }
        
        // Build sparse table
        sparse.assign(n, vector<int>(logN));
        
        for (int i = 0; i < n; i++) {
            sparse[i][0] = arr[i];
        }
        
        for (int j = 1; j < logN; j++) {
            for (int i = 0; i + (1 << j) <= n; i++) {
                sparse[i][j] = min(
                    sparse[i][j-1],
                    sparse[i + (1 << (j-1))][j-1]
                );
            }
        }
    }
    
    int query(int left, int right) {
        int len = right - left + 1;
        int k = logTable[len];
        return min(
            sparse[left][k],
            sparse[right - (1 << k) + 1][k]
        );
    }
};

Complexity Analysis:

  • Space: O(n log n) for the sparse table, O(n) for the log table
  • Preprocessing: O(n log n) to fill the sparse table
  • Query: O(1) — two table lookups and one comparison

The constant factors are excellent. No recursion, no branching beyond the min comparison, and cache-friendly access patterns during construction.

When to Use Sparse Table

Choose sparse tables when:

  • Your data is static (no updates after construction)
  • You need the absolute fastest query time
  • You’re working with idempotent operations (min, max, gcd, lcm, bitwise AND/OR)
  • Memory isn’t severely constrained (O(n log n) is acceptable)

Choose segment trees instead when:

  • You need point updates or range updates
  • You’re computing non-idempotent operations like sum or product
  • You need lazy propagation for range modifications

Extensions worth knowing:

  • 2D Sparse Tables: Extend to matrices for 2D RMQ in O(1) with O(n² log² n) preprocessing
  • Index Queries: Store indices instead of values when you need the position of the minimum
  • Other Operations: GCD sparse tables are common in number theory problems

The non-idempotent trap: Don’t try to use sparse tables for range sum queries. The overlapping ranges will double-count elements. For sums, use prefix sums (O(1) query, O(n) space) or segment trees if updates are needed.

Sparse tables represent a beautiful tradeoff in algorithm design. By accepting slightly more preprocessing time and space, we achieve the theoretical optimum for query complexity. When your problem fits the static RMQ pattern, reach for sparse tables first—they’re simpler to implement than segment trees and faster in practice.
