How to Use ARRAY Types in PostgreSQL

Key Insights

PostgreSQL arrays let you store multiple values in a single column, eliminating joins for simple lists like tags or phone numbers, but they sacrifice normalization and should be used strategically.
Array operators like @> (contains), && (overlaps), and ANY make querying array data intuitive, while GIN indexes provide excellent performance for array containment searches.
Functions like unnest() and array_agg() bridge the gap between array and relational paradigms, letting you expand arrays into rows or aggregate rows into arrays as needed.

Introduction to PostgreSQL Arrays

PostgreSQL supports native array types, allowing you to store multiple values of the same data type in a single column. Unlike most relational databases that force you to create junction tables for one-to-many relationships, PostgreSQL gives you the option to denormalize data when it makes sense.

Arrays work best for simple lists where you don’t need to query individual elements frequently or maintain referential integrity. Think tags on blog posts, phone numbers for a contact, or product categories. If you need complex queries on the related data, foreign key constraints, or the list items have their own attributes, stick with normalized tables.

PostgreSQL supports both single-dimensional and multi-dimensional arrays. Single-dimensional arrays are by far the most common. Multi-dimensional arrays exist but are rarely needed in practice—if you’re considering them, you probably want JSONB instead.

Here’s the simplest possible example:

CREATE TABLE articles (
    id SERIAL PRIMARY KEY,
    title TEXT,
    tags TEXT[]
);

INSERT INTO articles (title, tags) 
VALUES ('PostgreSQL Arrays', ARRAY['postgresql', 'database', 'sql']);

Creating and Defining Array Columns

Array column definitions use bracket notation after the data type. You can create arrays of any built-in or custom type:

CREATE TABLE contacts (
    id SERIAL PRIMARY KEY,
    name TEXT,
    phone_numbers TEXT[],
    lucky_numbers INTEGER[],
    metadata JSONB[]
);

-- Arrays of custom types
CREATE TYPE mood AS ENUM ('happy', 'sad', 'neutral');

CREATE TABLE daily_logs (
    id SERIAL PRIMARY KEY,
    log_date DATE,
    hourly_moods mood[]
);

You can specify array dimensions in the type definition, but PostgreSQL doesn’t enforce them—INTEGER[3] and INTEGER[] behave identically. The dimension specification is purely documentation.

Setting default values for arrays is straightforward:

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name TEXT,
    categories TEXT[] DEFAULT ARRAY[]::TEXT[],
    ratings INTEGER[] DEFAULT '{5,5,5}'
);

PostgreSQL doesn’t support CHECK constraints that reference array length directly, but you can use a function:

CREATE TABLE limited_tags (
    id SERIAL PRIMARY KEY,
    tags TEXT[] CHECK (array_length(tags, 1) <= 5)
);

Inserting and Updating Array Data

PostgreSQL provides multiple syntaxes for array literals. The most readable is the ARRAY constructor:

INSERT INTO contacts (name, phone_numbers, lucky_numbers)
VALUES (
    'John Doe',
    ARRAY['555-0100', '555-0101'],
    ARRAY[7, 13, 42]
);

You can also use the curly brace syntax, which is more compact but requires proper escaping:

INSERT INTO contacts (name, phone_numbers, lucky_numbers)
VALUES (
    'Jane Smith',
    '{"555-0200", "555-0201"}',
    '{3, 9, 27}'
);

For text arrays containing special characters, you need to escape quotes and backslashes:

INSERT INTO articles (title, tags)
VALUES (
    'Escaping in Arrays',
    '{"tag with spaces", "tag\"with\"quotes"}'
);

Updating arrays works like any other column, but you can also update individual elements by index (PostgreSQL arrays are 1-indexed, not 0-indexed):

-- Replace entire array
UPDATE contacts 
SET phone_numbers = ARRAY['555-9999']
WHERE id = 1;

-- Update specific element
UPDATE contacts 
SET phone_numbers[1] = '555-8888'
WHERE id = 1;

-- Append to array
UPDATE contacts 
SET phone_numbers = array_append(phone_numbers, '555-7777')
WHERE id = 1;

Querying Array Data

Array querying is where PostgreSQL really shines. You can access individual elements, search for values, and check for containment or overlap.

Access elements by index:

-- Get first phone number (arrays are 1-indexed!)
SELECT name, phone_numbers[1] AS primary_phone
FROM contacts;

-- Get array slice
SELECT name, phone_numbers[1:2] AS first_two_phones
FROM contacts;

The ANY operator checks if a value matches any array element:

-- Find contacts with a specific phone number
SELECT * FROM contacts
WHERE '555-0100' = ANY(phone_numbers);

-- Find articles with 'postgresql' tag
SELECT * FROM articles
WHERE 'postgresql' = ANY(tags);

The containment operator @> checks if an array contains all elements of another array:

-- Find articles tagged with both 'postgresql' AND 'database'
SELECT * FROM articles
WHERE tags @> ARRAY['postgresql', 'database'];

The reverse containment operator <@ checks if an array is contained by another:

-- Find articles with only approved tags
SELECT * FROM articles
WHERE tags <@ ARRAY['postgresql', 'database', 'sql', 'performance'];

The overlap operator && checks if arrays have any common elements:

-- Find articles sharing at least one tag
SELECT * FROM articles
WHERE tags && ARRAY['postgresql', 'mongodb'];

Array Functions and Operations

PostgreSQL provides rich array manipulation functions. Here are the essential ones:

-- Get array length
SELECT array_length(tags, 1) FROM articles;

-- Append and prepend elements
SELECT array_append(ARRAY[1,2,3], 4);  -- {1,2,3,4}
SELECT array_prepend(0, ARRAY[1,2,3]); -- {0,1,2,3}

-- Concatenate arrays
SELECT ARRAY[1,2] || ARRAY[3,4];       -- {1,2,3,4}

-- Remove elements
SELECT array_remove(ARRAY[1,2,3,2], 2); -- {1,3}

The unnest() function expands an array into a set of rows, which is invaluable for joining array data with other tables:

-- Expand tags into individual rows
SELECT id, title, unnest(tags) AS tag
FROM articles;

-- Count tag usage across all articles
SELECT tag, COUNT(*) AS usage_count
FROM (
    SELECT unnest(tags) AS tag FROM articles
) AS expanded_tags
GROUP BY tag
ORDER BY usage_count DESC;

The reverse operation is array_agg(), which aggregates rows into an array:

-- Aggregate all tags into a single array
SELECT array_agg(DISTINCT tag) AS all_tags
FROM (SELECT unnest(tags) AS tag FROM articles) AS t;

-- Group phone numbers by area code
SELECT 
    substring(phone FROM 1 FOR 3) AS area_code,
    array_agg(phone) AS numbers
FROM (SELECT unnest(phone_numbers) AS phone FROM contacts) AS t
GROUP BY area_code;

Indexing Arrays for Performance

Without indexes, array containment queries require sequential scans. GIN (Generalized Inverted Index) indexes solve this:

CREATE INDEX idx_articles_tags ON articles USING GIN(tags);

GIN indexes dramatically improve performance for @>, <@, &&, and = ANY operations:

-- This query uses the GIN index
EXPLAIN ANALYZE
SELECT * FROM articles 
WHERE tags @> ARRAY['postgresql'];

For simple equality checks on the entire array, use a standard B-tree index:

CREATE INDEX idx_articles_tags_btree ON articles(tags);

-- B-tree index works here
SELECT * FROM articles WHERE tags = ARRAY['postgresql', 'database'];

GIN indexes are larger than B-tree indexes and slower to update, but the query performance gain for array searches is worth it. If you’re frequently querying array contents, create the GIN index.

Common Patterns and Best Practices

Use arrays for simple, bounded lists. Tags, categories, phone numbers, and email addresses are good candidates. Each item is atomic, the list is relatively short (under 100 elements), and you don’t need to query attributes of individual items.

Avoid arrays when you need referential integrity. If array elements should reference other tables, use a junction table instead:

-- DON'T do this
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    product_ids INTEGER[]  -- No foreign key constraint possible
);

-- DO this instead
CREATE TABLE orders (
    id SERIAL PRIMARY KEY
);

CREATE TABLE order_items (
    order_id INTEGER REFERENCES orders(id),
    product_id INTEGER REFERENCES products(id),
    PRIMARY KEY (order_id, product_id)
);

Real-world example: A tagging system

CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT,
    tags TEXT[] DEFAULT ARRAY[]::TEXT[],
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_posts_tags ON posts USING GIN(tags);

-- Insert posts with tags
INSERT INTO posts (title, content, tags)
VALUES 
    ('PostgreSQL Arrays', 'Learn about arrays...', ARRAY['postgresql', 'tutorial']),
    ('Database Indexing', 'Indexes improve...', ARRAY['postgresql', 'performance']);

-- Find posts by tag
SELECT * FROM posts WHERE tags @> ARRAY['postgresql'];

-- Find posts with any of multiple tags
SELECT * FROM posts WHERE tags && ARRAY['tutorial', 'performance'];

-- Get tag popularity
SELECT tag, COUNT(*) AS post_count
FROM (SELECT unnest(tags) AS tag FROM posts) AS t
GROUP BY tag
ORDER BY post_count DESC;

-- Add tag to existing post
UPDATE posts 
SET tags = array_append(tags, 'database')
WHERE id = 1;

-- Remove tag from post
UPDATE posts 
SET tags = array_remove(tags, 'tutorial')
WHERE id = 1;

Arrays aren’t a silver bullet. They trade normalization for convenience. Use them when the convenience wins—when you need simple lists without complex querying requirements. When you need relationships, constraints, or attributes on list items, use proper normalized tables. PostgreSQL gives you both tools; choose the right one for each job.