How to Use JSON Functions in PostgreSQL

PostgreSQL introduced JSON support in version 9.2 and added the superior JSONB type in 9.4. While both types store JSON data, JSONB stores data in a decomposed binary format that eliminates...

Key Insights

  • PostgreSQL’s JSONB type offers superior performance over JSON through binary storage and indexing capabilities, making it the default choice for production applications storing semi-structured data.
  • The operator syntax (->, ->>, @>, ?) provides intuitive JSON navigation while specialized functions like jsonb_set() and jsonb_array_elements() enable complex manipulations without application-layer processing.
  • GIN indexes on JSONB columns can deliver query performance comparable to traditional indexed columns, but JSON should complement—not replace—proper relational design for core business entities.

Introduction to JSON Support in PostgreSQL

PostgreSQL introduced JSON support in version 9.2 and added the superior JSONB type in 9.4. While both types store JSON data, JSONB stores data in a decomposed binary format that eliminates whitespace and duplicate keys, enabling faster processing and indexing.

The choice between JSON and JSONB is straightforward: use JSONB unless you specifically need to preserve the exact formatting of input JSON (rare). JSONB supports indexing, offers more operators, and processes queries significantly faster. The only trade-off is slightly slower insertion due to conversion overhead—a worthwhile exchange for read-heavy workloads.

Use JSON columns when you need flexibility for semi-structured data like user preferences, API responses, or configuration settings. Don’t use JSON to avoid proper schema design for core entities. A users table should have dedicated columns for email and username, not a single JSON blob.

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    specifications JSONB,  -- Use JSONB for flexible attributes
    metadata JSON          -- Rarely needed; JSONB is almost always better
);

Storing and Retrieving JSON Data

Inserting JSON data is straightforward—pass a valid JSON string or use PostgreSQL’s JSON construction functions. The -> operator returns JSON, while ->> returns text.

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) UNIQUE NOT NULL,
    profile JSONB
);

INSERT INTO users (email, profile) VALUES
    ('alice@example.com', '{"age": 28, "city": "Seattle", "interests": ["hiking", "photography"]}'),
    ('bob@example.com', '{"age": 35, "city": "Portland", "interests": ["cycling"]}'),
    ('carol@example.com', '{"age": 42, "city": "Seattle", "interests": ["reading", "cooking", "hiking"]}');

-- Retrieve JSON field (returns JSONB type)
SELECT email, profile -> 'city' AS city_json FROM users;

-- Retrieve as text (returns TEXT type)
SELECT email, profile ->> 'city' AS city_text FROM users;

-- Extract nested values
SELECT email, profile -> 'interests' ->> 0 AS first_interest FROM users;

For deeply nested data, chain operators or use the #> path operator with an array of keys:

-- Assuming nested structure
INSERT INTO users (email, profile) VALUES
    ('dave@example.com', '{"contact": {"phone": {"mobile": "555-1234"}}}');

-- Chain operators
SELECT profile -> 'contact' -> 'phone' ->> 'mobile' FROM users WHERE email = 'dave@example.com';

-- Path operator (cleaner for deep nesting)
SELECT profile #>> '{contact,phone,mobile}' FROM users WHERE email = 'dave@example.com';

Querying JSON Data with Functions

PostgreSQL provides powerful functions for querying JSON structures. The jsonb_array_elements() function expands arrays into rows, enabling array filtering and analysis.

-- Find users interested in hiking
SELECT email, profile ->> 'city' AS city
FROM users,
     jsonb_array_elements_text(profile -> 'interests') AS interest
WHERE interest = 'hiking';

-- Count interests per user
SELECT email, jsonb_array_length(profile -> 'interests') AS interest_count
FROM users;

The containment operator @> checks if one JSON structure contains another, providing efficient filtering:

-- Find users with specific attributes
SELECT email FROM users WHERE profile @> '{"city": "Seattle"}';

-- Find users with specific interest (checks array containment)
SELECT email FROM users WHERE profile -> 'interests' @> '"hiking"';

The existence operators ?, ?|, and ?& check for key presence:

-- Users with a 'city' key
SELECT email FROM users WHERE profile ? 'city';

-- Users with either 'phone' or 'mobile' keys
SELECT email FROM users WHERE profile ?| array['phone', 'mobile'];

-- Users with both 'age' and 'city' keys
SELECT email FROM users WHERE profile ?& array['age', 'city'];

For converting JSON to relational format, use jsonb_to_record():

SELECT email, p.*
FROM users,
     jsonb_to_record(profile) AS p(age INT, city TEXT)
WHERE email = 'alice@example.com';

Modifying JSON Data

The jsonb_set() function updates values at specific paths. It requires the target path as an array and creates missing intermediate keys by default.

-- Update a user's city
UPDATE users
SET profile = jsonb_set(profile, '{city}', '"Austin"')
WHERE email = 'alice@example.com';

-- Update nested value
UPDATE users
SET profile = jsonb_set(profile, '{contact,phone,mobile}', '"555-9999"')
WHERE email = 'dave@example.com';

-- Add new top-level key
UPDATE users
SET profile = jsonb_set(profile, '{verified}', 'true')
WHERE email = 'bob@example.com';

The concatenation operator || merges JSON objects:

-- Add multiple fields at once
UPDATE users
SET profile = profile || '{"newsletter": true, "verified": true}'
WHERE email = 'carol@example.com';

Remove keys with the - operator:

-- Remove a single key
UPDATE users
SET profile = profile - 'verified'
WHERE email = 'bob@example.com';

-- Remove multiple keys
UPDATE users
SET profile = profile - '{newsletter,verified}'
WHERE email = 'carol@example.com';

For array manipulation, combine operators with jsonb_set():

-- Append to array
UPDATE users
SET profile = jsonb_set(
    profile,
    '{interests}',
    (profile -> 'interests') || '"gaming"'
)
WHERE email = 'bob@example.com';

-- Remove array element by value
UPDATE users
SET profile = jsonb_set(
    profile,
    '{interests}',
    (SELECT jsonb_agg(elem)
     FROM jsonb_array_elements_text(profile -> 'interests') AS elem
     WHERE elem != 'cycling')
)
WHERE email = 'bob@example.com';

Indexing and Performance Optimization

GIN (Generalized Inverted Index) indexes dramatically improve JSONB query performance. Create a GIN index on the entire JSONB column to accelerate containment and existence operators:

CREATE INDEX idx_users_profile ON users USING GIN (profile);

-- This query now uses the index
EXPLAIN ANALYZE
SELECT email FROM users WHERE profile @> '{"city": "Seattle"}';

For queries targeting specific JSON paths, expression indexes provide optimal performance:

-- Index a specific path
CREATE INDEX idx_users_city ON users ((profile ->> 'city'));

-- This query uses the expression index
EXPLAIN ANALYZE
SELECT email FROM users WHERE profile ->> 'city' = 'Seattle';

Performance comparison on a table with 100,000 rows:

-- Without index: Sequential scan, ~45ms
EXPLAIN ANALYZE
SELECT * FROM users WHERE profile @> '{"city": "Seattle"}';

-- With GIN index: Bitmap index scan, ~2ms
CREATE INDEX idx_users_profile_gin ON users USING GIN (profile);
EXPLAIN ANALYZE
SELECT * FROM users WHERE profile @> '{"city": "Seattle"}';

Advanced JSON Techniques

Aggregate relational data into JSON structures using json_agg() and jsonb_object_agg():

CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INT,
    order_date DATE
);

CREATE TABLE order_items (
    id SERIAL PRIMARY KEY,
    order_id INT,
    product_name VARCHAR(255),
    quantity INT,
    price DECIMAL(10,2)
);

-- Build hierarchical JSON response
SELECT 
    o.id,
    o.order_date,
    jsonb_build_object(
        'items', jsonb_agg(
            jsonb_build_object(
                'product', oi.product_name,
                'quantity', oi.quantity,
                'price', oi.price
            )
        ),
        'total', SUM(oi.quantity * oi.price)
    ) AS order_details
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
WHERE o.user_id = 1
GROUP BY o.id, o.order_date;

Convert entire rows to JSON with row_to_json():

SELECT jsonb_agg(row_to_json(u))
FROM (SELECT id, email, profile FROM users LIMIT 10) u;

Best Practices and Common Pitfalls

Don’t use JSON as a crutch for poor schema design. Core business entities with well-defined attributes belong in columns, not JSON blobs. JSON is for truly flexible, optional, or variable data.

Implement schema validation using CHECK constraints and custom functions:

CREATE OR REPLACE FUNCTION validate_user_profile(profile JSONB)
RETURNS BOOLEAN AS $$
BEGIN
    -- Ensure required keys exist
    IF NOT (profile ? 'age' AND profile ? 'city') THEN
        RETURN FALSE;
    END IF;
    
    -- Validate age is a number
    IF jsonb_typeof(profile -> 'age') != 'number' THEN
        RETURN FALSE;
    END IF;
    
    RETURN TRUE;
END;
$$ LANGUAGE plpgsql;

ALTER TABLE users ADD CONSTRAINT valid_profile
    CHECK (validate_user_profile(profile));

Migrate from JSON to JSONB for existing tables:

-- Add new JSONB column
ALTER TABLE legacy_table ADD COLUMN data_jsonb JSONB;

-- Copy and convert data
UPDATE legacy_table SET data_jsonb = data::JSONB;

-- Swap columns (in transaction)
BEGIN;
ALTER TABLE legacy_table DROP COLUMN data;
ALTER TABLE legacy_table RENAME COLUMN data_jsonb TO data;
COMMIT;

Monitor query performance with EXPLAIN ANALYZE and ensure indexes are being used. If you see sequential scans on large JSONB columns, add appropriate indexes.

PostgreSQL’s JSON capabilities bridge the gap between relational rigor and document flexibility. Use JSONB for semi-structured data, index appropriately, and maintain schema discipline for core entities. This approach delivers both flexibility and performance without sacrificing data integrity.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.