SQL - Array/UNNEST Operations (PostgreSQL)
PostgreSQL supports native array types for any data type, storing multiple values in a single column. Arrays maintain insertion order and allow duplicates, making them suitable for ordered...
Key Insights
- PostgreSQL’s array operations and UNNEST function transform complex data manipulation tasks that would require multiple queries or application-level processing into single, efficient SQL statements
- UNNEST converts arrays into table rows, enabling JOIN operations, aggregations, and set-based logic directly within queries while maintaining ACID guarantees
- Combining array functions with CTEs, window functions, and lateral joins creates powerful patterns for hierarchical data, many-to-many relationships, and bulk operations
Understanding PostgreSQL Arrays
PostgreSQL supports native array types for any data type, storing multiple values in a single column. Arrays maintain insertion order and allow duplicates, making them suitable for ordered collections and denormalized data structures.
-- Create table with array columns
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT,
tags TEXT[],
prices NUMERIC[],
related_ids INTEGER[]
);
-- Insert data with array literals
INSERT INTO products (name, tags, prices, related_ids) VALUES
('Laptop', ARRAY['electronics', 'computers', 'portable'],
ARRAY[999.99, 1299.99, 1499.99], ARRAY[2, 3]),
('Mouse', ARRAY['electronics', 'accessories'],
ARRAY[29.99, 49.99], ARRAY[1]),
('Desk', ARRAY['furniture', 'office'],
ARRAY[299.99], ARRAY[4]);
-- Array construction alternatives
INSERT INTO products (name, tags, prices) VALUES
('Chair', '{"furniture", "office", "ergonomic"}', '{199.99, 249.99}');
Array indexing starts at 1 (not 0), and you can access elements using bracket notation or the array_length function for bounds checking.
-- Access array elements
SELECT
name,
tags[1] as first_tag,
tags[array_length(tags, 1)] as last_tag,
array_length(tags, 1) as tag_count
FROM products;
UNNEST Fundamentals
UNNEST transforms an array into a set of rows, one per array element. This operation is crucial for set-based operations on array data.
-- Basic UNNEST
SELECT unnest(ARRAY['apple', 'banana', 'cherry']) as fruit;
-- Returns 3 rows
-- UNNEST with table data
SELECT
p.name,
unnest(p.tags) as tag
FROM products p;
-- UNNEST with ordinality (position tracking)
SELECT
p.name,
tag,
position
FROM products p
CROSS JOIN LATERAL unnest(p.tags) WITH ORDINALITY as t(tag, position);
The WITH ORDINALITY clause adds a position column, preserving array order information after expansion.
Filtering and Aggregating Array Data
UNNEST enables filtering on array elements and aggregating results back into arrays or scalar values.
-- Find products with specific tags
SELECT DISTINCT name
FROM products p
CROSS JOIN unnest(p.tags) as tag
WHERE tag = 'electronics';
-- Count products per tag
SELECT
tag,
COUNT(*) as product_count
FROM products p
CROSS JOIN unnest(p.tags) as tag
GROUP BY tag
ORDER BY product_count DESC;
-- Aggregate back to arrays
SELECT
ARRAY_AGG(DISTINCT tag ORDER BY tag) as all_unique_tags
FROM products p
CROSS JOIN unnest(p.tags) as tag;
-- Filter and reconstruct arrays
SELECT
p.name,
ARRAY_AGG(price ORDER BY price) as filtered_prices
FROM products p
CROSS JOIN unnest(p.prices) as price
WHERE price < 500
GROUP BY p.id, p.name;
Multiple Array UNNEST Operations
When working with multiple arrays simultaneously, PostgreSQL requires careful handling to maintain row correspondence.
-- Parallel UNNEST (wrong approach - creates Cartesian product)
SELECT
p.name,
tag,
price
FROM products p
CROSS JOIN unnest(p.tags) as tag
CROSS JOIN unnest(p.prices) as price; -- Multiplies rows incorrectly
-- Correct parallel UNNEST using single UNNEST call
SELECT
p.name,
tag,
price,
related_id
FROM products p
CROSS JOIN unnest(p.tags, p.prices, p.related_ids)
as t(tag, price, related_id);
-- Handle arrays of different lengths with ordinality
SELECT
p.name,
t1.tag,
t1.pos,
t2.price,
t2.pos
FROM products p
CROSS JOIN LATERAL unnest(p.tags) WITH ORDINALITY as t1(tag, pos)
FULL OUTER JOIN LATERAL unnest(p.prices) WITH ORDINALITY as t2(price, pos)
ON t1.pos = t2.pos;
Array Operations and Transformations
PostgreSQL provides extensive array manipulation functions that combine powerfully with UNNEST.
-- Array operators
SELECT
tags,
'electronics' = ANY(tags) as has_electronics,
tags @> ARRAY['furniture'] as contains_furniture,
tags && ARRAY['office', 'portable'] as overlaps_office_or_portable
FROM products;
-- Array modification
SELECT
name,
tags,
array_append(tags, 'featured') as with_featured,
array_prepend('new', tags) as with_new,
array_remove(tags, 'electronics') as without_electronics,
tags || ARRAY['sale'] as concatenated
FROM products;
-- Transform array elements
SELECT
name,
ARRAY_AGG(upper(tag) ORDER BY tag) as uppercase_tags
FROM products
CROSS JOIN unnest(tags) as tag
GROUP BY id, name;
-- Conditional array building
SELECT
name,
ARRAY_AGG(price) FILTER (WHERE price > 100) as expensive_prices,
ARRAY_AGG(price ORDER BY price DESC) as sorted_prices
FROM products
CROSS JOIN unnest(prices) as price
GROUP BY id, name;
Advanced Patterns: Many-to-Many Relationships
Arrays with UNNEST can simplify many-to-many relationships without junction tables, particularly for read-heavy workloads.
-- Create related tables
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INTEGER,
product_ids INTEGER[],
order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO orders (customer_id, product_ids) VALUES
(101, ARRAY[1, 2]),
(102, ARRAY[1, 3]),
(103, ARRAY[2, 3, 4]);
-- Join through array expansion
SELECT
o.id as order_id,
o.customer_id,
p.name as product_name,
p.tags
FROM orders o
CROSS JOIN unnest(o.product_ids) as product_id
JOIN products p ON p.id = product_id;
-- Aggregate analysis across relationships
SELECT
p.name,
COUNT(DISTINCT o.customer_id) as unique_customers,
COUNT(o.id) as times_ordered
FROM orders o
CROSS JOIN unnest(o.product_ids) as product_id
JOIN products p ON p.id = product_id
GROUP BY p.id, p.name
ORDER BY times_ordered DESC;
Performance Considerations and Indexing
Arrays and UNNEST operations have specific performance characteristics that differ from normalized designs.
-- GIN index for array containment queries
CREATE INDEX idx_products_tags ON products USING GIN(tags);
-- Fast containment queries
EXPLAIN ANALYZE
SELECT name FROM products
WHERE tags @> ARRAY['electronics'];
-- GIN index for element search
CREATE INDEX idx_products_tags_elements ON products
USING GIN(tags array_ops);
-- Efficient ANY queries
SELECT name FROM products
WHERE 'furniture' = ANY(tags);
-- Avoid UNNEST in WHERE clauses when possible
-- Slow:
SELECT name FROM products
WHERE EXISTS (
SELECT 1 FROM unnest(tags) t WHERE t LIKE 'elect%'
);
-- Faster with array functions:
SELECT name FROM products
WHERE EXISTS (
SELECT 1 FROM unnest(tags) t WHERE t LIKE 'elect%'
)
-- Better yet, use array operators when applicable
OR tags::text LIKE '%elect%';
Practical Use Case: Tag-Based Search
Combining arrays with full-text search and relevance ranking creates powerful search functionality.
-- Create search-optimized view
CREATE MATERIALIZED VIEW product_search AS
SELECT
p.id,
p.name,
p.tags,
to_tsvector('english',
p.name || ' ' || array_to_string(p.tags, ' ')
) as search_vector
FROM products p;
CREATE INDEX idx_product_search ON product_search USING GIN(search_vector);
-- Search with tag boosting
SELECT
ps.name,
ps.tags,
ts_rank(ps.search_vector, query) as rank
FROM product_search ps,
to_tsquery('english', 'electronic | computer') query
WHERE ps.search_vector @@ query
OR ps.tags && ARRAY['electronics', 'computers']
ORDER BY rank DESC;
Arrays and UNNEST operations in PostgreSQL provide a middle ground between fully normalized schemas and document databases. They excel in scenarios with ordered collections, variable-length attributes, and read-heavy workloads where denormalization improves query performance. Understanding when to use arrays versus traditional normalization depends on access patterns, update frequency, and query complexity requirements.