SQL - PIVOT and UNPIVOT
PIVOT transforms rows into columns by rotating data around a pivot point. The operation requires three components: an aggregate function, a column to aggregate, and a column whose values become new...
Key Insights
- PIVOT transforms row-level data into columnar aggregates, converting unique values from one column into multiple columns with aggregated results
- UNPIVOT reverses this operation, normalizing wide-format data back into a narrow, row-based structure suitable for relational operations
- Both operations require careful handling of data types, NULL values, and dynamic column lists when working with real-world datasets
Understanding PIVOT Fundamentals
PIVOT transforms rows into columns by rotating data around a pivot point. The operation requires three components: an aggregate function, a column to aggregate, and a column whose values become new column headers.
-- Sample sales data
CREATE TABLE Sales (
SalesYear INT,
Quarter VARCHAR(2),
Region VARCHAR(50),
Amount DECIMAL(10,2)
);
INSERT INTO Sales VALUES
(2023, 'Q1', 'North', 15000),
(2023, 'Q2', 'North', 18000),
(2023, 'Q1', 'South', 12000),
(2023, 'Q2', 'South', 14000),
(2024, 'Q1', 'North', 16000),
(2024, 'Q2', 'North', 19000);
-- Basic PIVOT operation
SELECT *
FROM (
SELECT SalesYear, Quarter, Amount
FROM Sales
) AS SourceTable
PIVOT (
SUM(Amount)
FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS PivotTable;
The result transforms quarterly data into columns:
SalesYear Q1 Q2 Q3 Q4
2023 27000 32000 NULL NULL
2024 16000 19000 NULL NULL
PIVOT with Multiple Aggregations
Complex scenarios require multiple aggregations across the same pivot structure. This involves concatenating column names with aggregate identifiers.
-- Multiple metrics per quarter
SELECT *
FROM (
SELECT SalesYear, Quarter, Region, Amount
FROM Sales
) AS SourceTable
PIVOT (
SUM(Amount)
FOR Quarter IN ([Q1], [Q2])
) AS PivotTable;
-- For multiple aggregations, use CASE statements
SELECT
SalesYear,
SUM(CASE WHEN Quarter = 'Q1' THEN Amount END) AS Q1_Total,
AVG(CASE WHEN Quarter = 'Q1' THEN Amount END) AS Q1_Avg,
SUM(CASE WHEN Quarter = 'Q2' THEN Amount END) AS Q2_Total,
AVG(CASE WHEN Quarter = 'Q2' THEN Amount END) AS Q2_Avg
FROM Sales
GROUP BY SalesYear;
Dynamic PIVOT for Unknown Column Values
Hard-coding pivot columns fails when column values change. Dynamic SQL constructs PIVOT queries at runtime based on actual data.
-- Dynamic PIVOT using SQL Server
DECLARE @columns NVARCHAR(MAX), @sql NVARCHAR(MAX);
-- Build column list from data
SELECT @columns = STUFF((
SELECT DISTINCT ',' + QUOTENAME(Quarter)
FROM Sales
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'), 1, 1, '');
-- Construct dynamic query
SET @sql = N'
SELECT *
FROM (
SELECT SalesYear, Quarter, Amount
FROM Sales
) AS SourceTable
PIVOT (
SUM(Amount)
FOR Quarter IN (' + @columns + ')
) AS PivotTable';
EXEC sp_executesql @sql;
For PostgreSQL, use crosstab:
-- Install tablefunc extension
CREATE EXTENSION IF NOT EXISTS tablefunc;
-- Dynamic crosstab
SELECT *
FROM crosstab(
'SELECT SalesYear, Quarter, SUM(Amount)
FROM Sales
GROUP BY SalesYear, Quarter
ORDER BY 1, 2',
'SELECT DISTINCT Quarter FROM Sales ORDER BY 1'
) AS ct(SalesYear INT, Q1 DECIMAL, Q2 DECIMAL, Q3 DECIMAL, Q4 DECIMAL);
UNPIVOT for Data Normalization
UNPIVOT converts columns back into rows, essential for normalizing denormalized reporting tables or preparing data for analysis tools that expect narrow formats.
-- Wide format table
CREATE TABLE QuarterlySales (
SalesYear INT,
Q1 DECIMAL(10,2),
Q2 DECIMAL(10,2),
Q3 DECIMAL(10,2),
Q4 DECIMAL(10,2)
);
INSERT INTO QuarterlySales VALUES
(2023, 27000, 32000, 29000, 35000),
(2024, 16000, 19000, NULL, NULL);
-- UNPIVOT to normalize
SELECT SalesYear, Quarter, Amount
FROM QuarterlySales
UNPIVOT (
Amount FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS UnpivotTable;
Result:
SalesYear Quarter Amount
2023 Q1 27000
2023 Q2 32000
2023 Q3 29000
2023 Q4 35000
2024 Q1 16000
2024 Q2 19000
Note that UNPIVOT excludes NULL values by default. To include them, use CROSS APPLY with VALUES:
SELECT SalesYear, Quarter, Amount
FROM QuarterlySales
CROSS APPLY (
VALUES
('Q1', Q1),
('Q2', Q2),
('Q3', Q3),
('Q4', Q4)
) AS u(Quarter, Amount);
UNPIVOT with Multiple Column Sets
When unpivoting tables with multiple related column sets, maintain relationships between the sets.
CREATE TABLE SalesMetrics (
Region VARCHAR(50),
Q1_Sales DECIMAL(10,2),
Q1_Units INT,
Q2_Sales DECIMAL(10,2),
Q2_Units INT
);
INSERT INTO SalesMetrics VALUES
('North', 15000, 150, 18000, 180),
('South', 12000, 120, 14000, 140);
-- UNPIVOT maintaining relationships
SELECT Region, Quarter, Sales, Units
FROM SalesMetrics
UNPIVOT (
Sales FOR Quarter IN ([Q1_Sales], [Q2_Sales])
) AS SalesUnpivot
UNPIVOT (
Units FOR QuarterUnits IN ([Q1_Units], [Q2_Units])
) AS UnitsUnpivot
WHERE
REPLACE(Quarter, '_Sales', '') = REPLACE(QuarterUnits, '_Units', '');
Alternative approach using CROSS APPLY:
SELECT Region, Quarter, Sales, Units
FROM SalesMetrics
CROSS APPLY (
VALUES
('Q1', Q1_Sales, Q1_Units),
('Q2', Q2_Sales, Q2_Units)
) AS u(Quarter, Sales, Units);
Performance Considerations
PIVOT and UNPIVOT operations can impact query performance. Understanding execution plans helps optimize these operations.
-- Create indexes for PIVOT operations
CREATE INDEX idx_sales_quarter ON Sales(Quarter, SalesYear) INCLUDE (Amount);
-- Analyze execution plan
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Compare PIVOT vs CASE aggregation
-- PIVOT approach
SELECT *
FROM (SELECT SalesYear, Quarter, Amount FROM Sales) AS src
PIVOT (SUM(Amount) FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])) AS pvt;
-- CASE approach
SELECT
SalesYear,
SUM(CASE WHEN Quarter = 'Q1' THEN Amount END) AS Q1,
SUM(CASE WHEN Quarter = 'Q2' THEN Amount END) AS Q2,
SUM(CASE WHEN Quarter = 'Q3' THEN Amount END) AS Q3,
SUM(CASE WHEN Quarter = 'Q4' THEN Amount END) AS Q4
FROM Sales
GROUP BY SalesYear;
The CASE approach often performs better because the optimizer can use indexes more effectively. PIVOT adds an implicit sort operation that may not be necessary.
Cross-Database Compatibility
Different database systems implement pivoting differently. Here’s how to achieve similar results across platforms.
MySQL (using conditional aggregation):
SELECT
SalesYear,
SUM(IF(Quarter = 'Q1', Amount, 0)) AS Q1,
SUM(IF(Quarter = 'Q2', Amount, 0)) AS Q2,
SUM(IF(Quarter = 'Q3', Amount, 0)) AS Q3,
SUM(IF(Quarter = 'Q4', Amount, 0)) AS Q4
FROM Sales
GROUP BY SalesYear;
PostgreSQL (using FILTER):
SELECT
SalesYear,
SUM(Amount) FILTER (WHERE Quarter = 'Q1') AS Q1,
SUM(Amount) FILTER (WHERE Quarter = 'Q2') AS Q2,
SUM(Amount) FILTER (WHERE Quarter = 'Q3') AS Q3,
SUM(Amount) FILTER (WHERE Quarter = 'Q4') AS Q4
FROM Sales
GROUP BY SalesYear;
Handling NULL Values and Data Type Mismatches
PIVOT operations require consistent data types across pivoted columns. NULL handling needs explicit consideration.
-- Handle NULLs with COALESCE
SELECT
SalesYear,
COALESCE(Q1, 0) AS Q1,
COALESCE(Q2, 0) AS Q2,
COALESCE(Q3, 0) AS Q3,
COALESCE(Q4, 0) AS Q4
FROM (
SELECT SalesYear, Quarter, Amount
FROM Sales
) AS SourceTable
PIVOT (
SUM(Amount)
FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS PivotTable;
-- Ensure data type consistency
SELECT *
FROM (
SELECT
SalesYear,
Quarter,
CAST(Amount AS DECIMAL(15,2)) AS Amount
FROM Sales
) AS SourceTable
PIVOT (
SUM(Amount)
FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS PivotTable;
PIVOT and UNPIVOT provide powerful tools for reshaping data, but understanding their mechanics, performance implications, and cross-platform alternatives ensures you choose the right approach for each scenario.