How to Use CROSSTAB in PostgreSQL

CROSSTAB is PostgreSQL's built-in solution for creating pivot tables—transforming row-based data into a columnar format where unique values from one column become individual columns in the result...

Key Insights

  • CROSSTAB transforms row data into column format for pivot tables, but requires the tablefunc extension and strict adherence to column ordering in your source query
  • The basic CROSSTAB function needs predefined output columns, while the two-parameter variant handles dynamic categories by accepting a second query that defines column names
  • For simple pivots with few categories, CASE statements or FILTER clauses often outperform CROSSTAB and provide more flexibility, but CROSSTAB excels when dealing with many dynamic pivot columns

Introduction to CROSSTAB and the tablefunc Extension

CROSSTAB is PostgreSQL’s built-in solution for creating pivot tables—transforming row-based data into a columnar format where unique values from one column become individual columns in the result set. This is essential for reporting scenarios where you need to compare metrics across categories side-by-side, like monthly sales figures, regional performance data, or product ratings across different dimensions.

Unlike standard SQL aggregations that keep your data in rows, CROSSTAB reshapes your entire result set structure. Instead of seeing multiple rows for each product’s quarterly sales, you get a single row per product with separate columns for Q1, Q2, Q3, and Q4.

The functionality lives in the tablefunc extension, which isn’t enabled by default. You’ll need superuser privileges or appropriate permissions to install it:

CREATE EXTENSION IF NOT EXISTS tablefunc;

This is a one-time operation per database. Once installed, the CROSSTAB function becomes available for all users with access to that database.

Basic CROSSTAB Syntax and Simple Example

The CROSSTAB function expects a very specific input structure from your source query. The query must return at least three columns in this exact order:

  1. Row identifier (what becomes row labels)
  2. Category (what becomes column headers)
  3. Value (what populates the cells)

Here’s the critical part: you must explicitly define the output structure using a column definition list that matches your expected pivot columns.

Let’s pivot quarterly sales data:

-- Sample data setup
CREATE TABLE sales (
    product_name TEXT,
    quarter TEXT,
    revenue NUMERIC
);

INSERT INTO sales VALUES
    ('Widget A', 'Q1', 15000),
    ('Widget A', 'Q2', 18000),
    ('Widget A', 'Q3', 22000),
    ('Widget A', 'Q4', 25000),
    ('Widget B', 'Q1', 12000),
    ('Widget B', 'Q2', 14000),
    ('Widget B', 'Q4', 19000);

-- CROSSTAB query
SELECT * FROM CROSSTAB(
    'SELECT product_name, quarter, revenue 
     FROM sales 
     ORDER BY product_name, quarter',
    'SELECT DISTINCT quarter FROM sales ORDER BY quarter'
) AS ct (
    product_name TEXT,
    q1 NUMERIC,
    q2 NUMERIC,
    q3 NUMERIC,
    q4 NUMERIC
);

Result:

 product_name |  q1   |  q2   |  q3   |  q4
--------------+-------+-------+-------+-------
 Widget A     | 15000 | 18000 | 22000 | 25000
 Widget B     | 12000 | 14000 |       | 19000

Notice Widget B has a NULL for Q3 because no data exists for that quarter. The ORDER BY clauses are crucial—the source query must be sorted by the row identifier first, then the category.

CROSSTAB with Multiple Value Columns

You can pivot multiple metrics simultaneously by including additional value columns in your source query. Each value column you want to pivot requires its own set of output columns.

CREATE TABLE employee_metrics (
    employee_name TEXT,
    quarter TEXT,
    sales_count INTEGER,
    revenue NUMERIC,
    satisfaction_score DECIMAL(3,2)
);

INSERT INTO employee_metrics VALUES
    ('John', 'Q1', 45, 67500, 4.2),
    ('John', 'Q2', 52, 78000, 4.5),
    ('Sarah', 'Q1', 38, 57000, 4.7),
    ('Sarah', 'Q2', 41, 61500, 4.6);

SELECT * FROM CROSSTAB(
    'SELECT employee_name, quarter, sales_count, revenue, satisfaction_score
     FROM employee_metrics
     ORDER BY employee_name, quarter'
) AS ct (
    employee_name TEXT,
    q1_sales INTEGER,
    q1_revenue NUMERIC,
    q1_satisfaction DECIMAL(3,2),
    q2_sales INTEGER,
    q2_revenue NUMERIC,
    q2_satisfaction DECIMAL(3,2)
);

The output structure interleaves the value columns for each category. For two quarters and three metrics, you need six value columns in your output definition, ordered as: Q1_metric1, Q1_metric2, Q1_metric3, Q2_metric1, Q2_metric2, Q2_metric3.

Using CROSSTAB with Dynamic Categories

The examples above hardcode the pivot columns, which fails when categories aren’t known in advance. The two-parameter CROSSTAB variant solves this by accepting a second query that defines the category values:

CREATE TABLE product_ratings (
    product_id INTEGER,
    store_location TEXT,
    avg_rating DECIMAL(3,2)
);

INSERT INTO product_ratings VALUES
    (101, 'New York', 4.5),
    (101, 'Chicago', 4.2),
    (101, 'Seattle', 4.7),
    (102, 'New York', 3.8),
    (102, 'Chicago', 4.1);

-- Two-parameter CROSSTAB
SELECT * FROM CROSSTAB(
    'SELECT product_id, store_location, avg_rating
     FROM product_ratings
     ORDER BY product_id, store_location',
    'SELECT DISTINCT store_location 
     FROM product_ratings 
     ORDER BY store_location'
) AS ct (
    product_id INTEGER,
    chicago DECIMAL(3,2),
    new_york DECIMAL(3,2),
    seattle DECIMAL(3,2)
);

Result:

 product_id | chicago | new_york | seattle
------------+---------+----------+---------
        101 |    4.2  |     4.5  |    4.7
        102 |    4.1  |     3.8  |

The second query determines which columns appear and in what order. You still need to define the output columns manually, but at least the category query can be generated dynamically if needed. The column names in your output definition don’t need to match the actual values—they’re positional.

Advanced Techniques: Aggregations and NULL Handling

Real-world data often requires aggregation before pivoting. Perform your GROUP BY operations in the source query, not within CROSSTAB:

CREATE TABLE orders (
    customer_id INTEGER,
    order_date DATE,
    order_total NUMERIC
);

INSERT INTO orders VALUES
    (1, '2024-01-15', 250),
    (1, '2024-01-20', 180),
    (1, '2024-02-10', 320),
    (2, '2024-01-05', 450),
    (2, '2024-03-12', 290);

SELECT * FROM CROSSTAB(
    'SELECT 
        customer_id,
        TO_CHAR(order_date, ''Month'') as month,
        SUM(order_total) as total
     FROM orders
     WHERE order_date >= ''2024-01-01'' AND order_date < ''2024-04-01''
     GROUP BY customer_id, TO_CHAR(order_date, ''Month'')
     ORDER BY customer_id, month'
) AS ct (
    customer_id INTEGER,
    january NUMERIC,
    february NUMERIC,
    march NUMERIC
);

For cleaner output, use COALESCE in a wrapping query to replace NULLs:

SELECT 
    customer_id,
    COALESCE(january, 0) as january,
    COALESCE(february, 0) as february,
    COALESCE(march, 0) as march
FROM CROSSTAB(
    'SELECT customer_id, TO_CHAR(order_date, ''Month''), SUM(order_total)
     FROM orders
     WHERE order_date >= ''2024-01-01'' AND order_date < ''2024-04-01''
     GROUP BY customer_id, TO_CHAR(order_date, ''Month'')
     ORDER BY customer_id, TO_CHAR(order_date, ''Month'')'
) AS ct (customer_id INTEGER, january NUMERIC, february NUMERIC, march NUMERIC);

Common Pitfalls and Troubleshooting

The most frequent error is column count mismatch. Your output definition must exactly match the number of categories plus one row identifier:

-- WRONG: Only defines 3 output columns but source has 4 quarters
SELECT * FROM CROSSTAB(
    'SELECT product_name, quarter, revenue FROM sales ORDER BY 1,2'
) AS ct (product_name TEXT, q1 NUMERIC, q2 NUMERIC);
-- Error: return and sql tuple descriptions are incompatible

-- CORRECT: Defines all 4 quarters
SELECT * FROM CROSSTAB(
    'SELECT product_name, quarter, revenue FROM sales ORDER BY 1,2'
) AS ct (product_name TEXT, q1 NUMERIC, q2 NUMERIC, q3 NUMERIC, q4 NUMERIC);

Data type mismatches cause silent failures or errors. Ensure your output column types match what the source query returns:

-- WRONG: Defining revenue as INTEGER when it's NUMERIC
AS ct (product_name TEXT, q1 INTEGER, q2 INTEGER)

-- CORRECT: Match the source data type
AS ct (product_name TEXT, q1 NUMERIC, q2 NUMERIC)

Performance degrades with large datasets because CROSSTAB processes the entire result set in memory. For tables with millions of rows, filter aggressively in your source query and consider materialized views for frequently-accessed pivots.

Alternatives and When to Use CROSSTAB

For simple pivots with a known, small number of categories, CASE statements or FILTER clauses are often clearer and more maintainable:

-- Using CASE (works in all PostgreSQL versions)
SELECT 
    product_name,
    SUM(CASE WHEN quarter = 'Q1' THEN revenue ELSE 0 END) as q1,
    SUM(CASE WHEN quarter = 'Q2' THEN revenue ELSE 0 END) as q2,
    SUM(CASE WHEN quarter = 'Q3' THEN revenue ELSE 0 END) as q3,
    SUM(CASE WHEN quarter = 'Q4' THEN revenue ELSE 0 END) as q4
FROM sales
GROUP BY product_name;

-- Using FILTER (PostgreSQL 9.4+, cleaner syntax)
SELECT 
    product_name,
    SUM(revenue) FILTER (WHERE quarter = 'Q1') as q1,
    SUM(revenue) FILTER (WHERE quarter = 'Q2') as q2,
    SUM(revenue) FILTER (WHERE quarter = 'Q3') as q3,
    SUM(revenue) FILTER (WHERE quarter = 'Q4') as q4
FROM sales
GROUP BY product_name;

-- Using CROSSTAB
SELECT * FROM CROSSTAB(
    'SELECT product_name, quarter, revenue FROM sales ORDER BY 1,2'
) AS ct (product_name TEXT, q1 NUMERIC, q2 NUMERIC, q3 NUMERIC, q4 NUMERIC);

Use CROSSTAB when you have many dynamic pivot columns (10+), especially when the categories come from data rather than being hardcoded. The two-parameter variant shines here because you can generate the category query programmatically.

Use CASE/FILTER when you have few, well-known categories and want better query readability. These approaches also integrate better with ORMs and give you more control over NULL handling and aggregation logic.

For application-level reporting, consider whether pivoting belongs in the database at all. Tools like pandas, R, or your BI platform often handle pivoting more flexibly. Let PostgreSQL do what it does best—filtering, joining, and aggregating—and pivot in the presentation layer when it makes architectural sense.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.