How to Use GROUP BY ROLLUP in PostgreSQL

Key Insights

ROLLUP generates hierarchical subtotals and grand totals in a single query, eliminating the need for UNION ALL statements to combine multiple aggregation levels
The GROUPING() function distinguishes between NULL values from actual data versus NULL values generated by ROLLUP aggregation, enabling proper labeling of total rows
ROLLUP processes columns right-to-left, creating a hierarchy where each level removes the rightmost column from the grouping, making column order critical to results

Introduction to ROLLUP

When building reports that require subtotals and grand totals, you typically face two options: write multiple GROUP BY queries and combine them with UNION ALL, or perform aggregation in application code. Both approaches are tedious and error-prone. PostgreSQL’s ROLLUP extension to GROUP BY solves this by generating multiple grouping levels in a single query.

ROLLUP creates a hierarchy of aggregations, starting with the most granular grouping and progressively removing columns from right to left until it produces a grand total. This is exactly what you need for financial reports, sales dashboards, and any scenario where you want to show data at multiple aggregation levels simultaneously.

Let’s start with a sample sales table that we’ll use throughout this article:

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    sale_date DATE,
    region VARCHAR(50),
    category VARCHAR(50),
    product VARCHAR(100),
    quantity INTEGER,
    amount DECIMAL(10, 2)
);

INSERT INTO sales (sale_date, region, category, product, quantity, amount) VALUES
    ('2024-01-15', 'North', 'Electronics', 'Laptop', 5, 5000.00),
    ('2024-01-20', 'North', 'Electronics', 'Phone', 10, 8000.00),
    ('2024-01-25', 'North', 'Furniture', 'Desk', 3, 900.00),
    ('2024-02-10', 'South', 'Electronics', 'Laptop', 8, 8000.00),
    ('2024-02-15', 'South', 'Electronics', 'Tablet', 12, 6000.00),
    ('2024-02-20', 'South', 'Furniture', 'Chair', 20, 2000.00);

Basic ROLLUP Syntax

The syntax for ROLLUP is straightforward—you simply add ROLLUP() around the columns in your GROUP BY clause:

SELECT 
    region,
    SUM(amount) as total_sales,
    COUNT(*) as num_transactions
FROM sales
GROUP BY ROLLUP(region)
ORDER BY region NULLS LAST;

This query produces three types of rows:

Sales grouped by each region
A grand total row where region is NULL

The output looks like this:

 region | total_sales | num_transactions
--------+-------------+------------------
 North  |    13900.00 |                3
 South  |    16000.00 |                3
 NULL   |    29900.00 |                6

The NULL row represents the grand total across all regions. This single query replaces what would otherwise require two separate queries combined with UNION ALL.

Multi-Column ROLLUP

ROLLUP’s real power emerges when you use multiple columns. It creates a hierarchy by progressively removing columns from right to left:

SELECT 
    region,
    category,
    product,
    SUM(amount) as total_sales
FROM sales
GROUP BY ROLLUP(region, category, product)
ORDER BY region NULLS LAST, category NULLS LAST, product NULLS LAST;

This generates multiple aggregation levels:

region + category + product (most granular)
region + category (product becomes NULL)
region (category and product become NULL)
Grand total (all columns become NULL)

The output shows the hierarchy clearly:

 region |  category   | product | total_sales
--------+-------------+---------+-------------
 North  | Electronics | Laptop  |     5000.00
 North  | Electronics | Phone   |     8000.00
 North  | Electronics | NULL    |    13000.00
 North  | Furniture   | Desk    |      900.00
 North  | Furniture   | NULL    |      900.00
 North  | NULL        | NULL    |    13900.00
 South  | Electronics | Laptop  |     8000.00
 South  | Electronics | Tablet  |     6000.00
 South  | Electronics | NULL    |    14000.00
 South  | Furniture   | Chair   |     2000.00
 South  | Furniture   | NULL    |     2000.00
 South  | NULL        | NULL    |    16000.00
 NULL   | NULL        | NULL    |    29900.00

Notice how ROLLUP creates subtotals for each category within each region, then subtotals for each region, and finally a grand total. The column order matters—changing it changes the hierarchy.

Identifying Grouping Levels with GROUPING()

The NULL values in ROLLUP output create ambiguity: is NULL an actual data value or a placeholder for “all values”? The GROUPING() function solves this by returning 1 when a column is aggregated away by ROLLUP, and 0 when it’s part of the grouping:

SELECT 
    CASE 
        WHEN GROUPING(region) = 1 THEN 'GRAND TOTAL'
        ELSE region 
    END as region,
    CASE 
        WHEN GROUPING(category) = 1 AND GROUPING(region) = 0 THEN 'Region Total'
        WHEN GROUPING(category) = 0 THEN category
        ELSE NULL
    END as category,
    product,
    SUM(amount) as total_sales,
    GROUPING(region) as region_grouped,
    GROUPING(category) as category_grouped,
    GROUPING(product) as product_grouped
FROM sales
GROUP BY ROLLUP(region, category, product)
ORDER BY region NULLS LAST, category NULLS LAST, product NULLS LAST;

This query adds descriptive labels and shows the GROUPING() values for each column. The GROUPING() function is essential for creating readable reports where you want to label subtotal and total rows differently from detail rows.

You can also use GROUPING() in WHERE clauses to filter for specific aggregation levels:

-- Only show region-level subtotals
SELECT 
    region,
    SUM(amount) as total_sales
FROM sales
GROUP BY ROLLUP(region, category)
HAVING GROUPING(region) = 0 AND GROUPING(category) = 1;

Practical Use Cases

ROLLUP excels in reporting scenarios. Here’s a complete quarterly sales report that demonstrates real-world usage:

SELECT 
    CASE 
        WHEN GROUPING(quarter) = 1 THEN 'TOTAL'
        ELSE 'Q' || quarter::TEXT
    END as period,
    CASE 
        WHEN GROUPING(region) = 1 AND GROUPING(quarter) = 0 THEN 'All Regions'
        WHEN GROUPING(region) = 1 THEN ''
        ELSE region
    END as region,
    CASE 
        WHEN GROUPING(category) = 1 AND GROUPING(region) = 0 THEN 'All Categories'
        WHEN GROUPING(category) = 1 THEN ''
        ELSE category
    END as category,
    TO_CHAR(SUM(amount), '$999,999.99') as total_sales,
    SUM(quantity) as units_sold,
    TO_CHAR(AVG(amount), '$999,999.99') as avg_transaction
FROM (
    SELECT 
        EXTRACT(QUARTER FROM sale_date) as quarter,
        region,
        category,
        amount,
        quantity
    FROM sales
) subquery
GROUP BY ROLLUP(quarter, region, category)
ORDER BY quarter NULLS LAST, region NULLS LAST, category NULLS LAST;

This query produces a professional report with properly labeled totals, formatted currency, and multiple aggregation levels. It’s exactly the kind of output you’d want for a dashboard or executive summary.

ROLLUP vs CUBE vs GROUPING SETS

PostgreSQL offers three GROUP BY extensions, each serving different purposes:

ROLLUP: Creates hierarchical subtotals (right-to-left)
CUBE: Creates all possible combinations of groupings
GROUPING SETS: Lets you specify exactly which groupings you want

Here’s a comparison:

-- ROLLUP: Creates hierarchy
SELECT region, category, SUM(amount) as total
FROM sales
GROUP BY ROLLUP(region, category);
-- Produces: (region, category), (region), ()

-- CUBE: All combinations
SELECT region, category, SUM(amount) as total
FROM sales
GROUP BY CUBE(region, category);
-- Produces: (region, category), (region), (category), ()

-- GROUPING SETS: Explicit control
SELECT region, category, SUM(amount) as total
FROM sales
GROUP BY GROUPING SETS ((region, category), (region), ());
-- Produces: Only the specified groupings

Use ROLLUP when you have a natural hierarchy (time periods, geographic regions, organizational structure). Use CUBE when you need to analyze data from all possible angles. Use GROUPING SETS when you want specific combinations without the full CUBE overhead.

Performance Considerations and Best Practices

ROLLUP is generally more efficient than multiple queries with UNION ALL because PostgreSQL can compute all aggregation levels in a single pass. However, you should still consider indexing:

-- Create indexes on ROLLUP columns
CREATE INDEX idx_sales_region_category_product 
ON sales(region, category, product);

-- Include aggregated columns for covering indexes
CREATE INDEX idx_sales_rollup_covering 
ON sales(region, category, product) 
INCLUDE (amount, quantity);

Compare the performance:

-- ROLLUP approach
EXPLAIN ANALYZE
SELECT region, category, SUM(amount)
FROM sales
GROUP BY ROLLUP(region, category);

-- Traditional approach with UNION ALL
EXPLAIN ANALYZE
SELECT region, category, SUM(amount)
FROM sales
GROUP BY region, category
UNION ALL
SELECT region, NULL, SUM(amount)
FROM sales
GROUP BY region
UNION ALL
SELECT NULL, NULL, SUM(amount)
FROM sales;

In most cases, ROLLUP shows better performance because it scans the table once rather than multiple times. The query planner can also optimize ROLLUP queries more effectively.

Key best practices:

Order ROLLUP columns from least granular to most granular (left to right) to match your reporting hierarchy
Always use GROUPING() to distinguish NULL values in production queries
Consider materialized views for frequently-run ROLLUP queries on large datasets
Use partial ROLLUP (e.g., GROUP BY region, ROLLUP(category, product)) when you don’t need all hierarchy levels
Add appropriate indexes on ROLLUP columns in the same order

ROLLUP transforms complex reporting queries into simple, maintainable SQL. Once you understand its hierarchical nature and the GROUPING() function, you’ll find countless opportunities to simplify your aggregation logic.