SQL - JSON Functions in SQL

Most modern relational databases support native JSON data types that validate and optimize JSON storage. PostgreSQL, MySQL 8.0+, SQL Server 2016+, and Oracle 12c+ all provide JSON capabilities with...

Key Insights

  • Modern SQL databases provide native JSON functions that eliminate the need for application-level parsing, reducing network overhead and improving query performance by up to 10x for nested data operations
  • JSON path expressions enable direct access to nested properties within stored JSON documents, allowing complex data transformations in a single query without multiple round trips
  • Combining relational and JSON querying patterns creates flexible schemas that support both structured analytics and semi-structured data storage in the same database

JSON Data Types and Storage

Most modern relational databases support native JSON data types that validate and optimize JSON storage. PostgreSQL, MySQL 8.0+, SQL Server 2016+, and Oracle 12c+ all provide JSON capabilities with slightly different syntax.

-- PostgreSQL
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255),
    attributes JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);

-- MySQL
CREATE TABLE products (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255),
    attributes JSON,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

PostgreSQL’s JSONB (binary JSON) offers better query performance than JSON type due to pre-parsed storage format. MySQL’s JSON type automatically validates and stores data in an optimized binary format.

-- Insert JSON data
INSERT INTO products (name, attributes) VALUES
('Laptop', '{"brand": "Dell", "specs": {"ram": "16GB", "storage": "512GB SSD"}, "price": 999.99}'),
('Mouse', '{"brand": "Logitech", "wireless": true, "dpi": 1600, "price": 49.99}');

Extracting JSON Values

JSON extraction operators retrieve values from JSON documents using path expressions. Each database uses different operators but similar concepts.

-- PostgreSQL: -> returns JSON, ->> returns text
SELECT 
    name,
    attributes->>'brand' AS brand,
    attributes->'specs'->>'ram' AS ram,
    (attributes->>'price')::NUMERIC AS price
FROM products;

-- MySQL: JSON_EXTRACT or -> operator
SELECT 
    name,
    JSON_UNQUOTE(JSON_EXTRACT(attributes, '$.brand')) AS brand,
    JSON_EXTRACT(attributes, '$.specs.ram') AS ram,
    JSON_EXTRACT(attributes, '$.price') AS price
FROM products;

-- Shorter MySQL syntax
SELECT 
    name,
    attributes->>'$.brand' AS brand,
    attributes->>'$.specs.ram' AS ram,
    attributes->>'$.price' AS price
FROM products;

The $ symbol represents the root of the JSON document. Dot notation navigates nested objects. PostgreSQL uses -> for JSON output and ->> for text output, while MySQL uses -> for JSON and ->> for unquoted text.

Querying JSON Arrays

JSON arrays require special functions to access elements by index or iterate over all elements.

-- Sample data with arrays
INSERT INTO products (name, attributes) VALUES
('Camera', '{"brand": "Canon", "features": ["4K Video", "WiFi", "Touch Screen"], "price": 799.99}');

-- PostgreSQL: Access array elements (0-indexed)
SELECT 
    name,
    attributes->'features'->0 AS first_feature,
    jsonb_array_length(attributes->'features') AS feature_count
FROM products
WHERE attributes ? 'features';

-- MySQL: Access array elements
SELECT 
    name,
    JSON_EXTRACT(attributes, '$.features[0]') AS first_feature,
    JSON_LENGTH(attributes, '$.features') AS feature_count
FROM products
WHERE JSON_CONTAINS_PATH(attributes, 'one', '$.features');

Expanding arrays into rows enables joining JSON array data with relational tables:

-- PostgreSQL: Expand array to rows
SELECT 
    p.name,
    feature.value AS feature
FROM products p
CROSS JOIN LATERAL jsonb_array_elements_text(p.attributes->'features') AS feature;

-- MySQL: Expand array to rows (8.0+)
SELECT 
    p.name,
    jt.feature
FROM products p
CROSS JOIN JSON_TABLE(
    p.attributes,
    '$.features[*]' COLUMNS(feature VARCHAR(100) PATH '$')
) AS jt;

Filtering and Searching JSON Data

JSON functions in WHERE clauses enable powerful filtering without extracting data to application code.

-- PostgreSQL: Filter by JSON values
SELECT name, attributes
FROM products
WHERE (attributes->>'price')::NUMERIC < 500
  AND attributes->>'brand' = 'Logitech';

-- Check if key exists
SELECT name
FROM products
WHERE attributes ? 'wireless';

-- Check if any array element matches
SELECT name
FROM products
WHERE attributes->'features' @> '["WiFi"]'::jsonb;

-- MySQL: Filter by JSON values
SELECT name, attributes
FROM products
WHERE JSON_EXTRACT(attributes, '$.price') < 500
  AND JSON_EXTRACT(attributes, '$.brand') = 'Logitech';

-- Check if key exists
SELECT name
FROM products
WHERE JSON_CONTAINS_PATH(attributes, 'one', '$.wireless');

-- Search array for value
SELECT name
FROM products
WHERE JSON_CONTAINS(attributes, '"WiFi"', '$.features');

Modifying JSON Data

Update specific JSON properties without replacing the entire document:

-- PostgreSQL: Update nested values
UPDATE products
SET attributes = jsonb_set(
    attributes,
    '{price}',
    '899.99',
    true
)
WHERE name = 'Laptop';

-- Add new key-value pair
UPDATE products
SET attributes = attributes || '{"warranty": "2 years"}'::jsonb
WHERE name = 'Laptop';

-- Remove key
UPDATE products
SET attributes = attributes - 'wireless'
WHERE name = 'Mouse';

-- MySQL: Update nested values
UPDATE products
SET attributes = JSON_SET(
    attributes,
    '$.price',
    899.99
)
WHERE name = 'Laptop';

-- Add new key-value pair
UPDATE products
SET attributes = JSON_INSERT(
    attributes,
    '$.warranty',
    '2 years'
)
WHERE name = 'Laptop';

-- Remove key
UPDATE products
SET attributes = JSON_REMOVE(attributes, '$.wireless')
WHERE name = 'Mouse';

Aggregating JSON Data

Build JSON objects and arrays from query results for API responses:

-- PostgreSQL: Build JSON object from row
SELECT jsonb_build_object(
    'product_name', name,
    'brand', attributes->>'brand',
    'price', attributes->>'price'
) AS product_summary
FROM products;

-- Aggregate rows into JSON array
SELECT jsonb_agg(
    jsonb_build_object(
        'name', name,
        'price', attributes->>'price'
    )
) AS products_list
FROM products
WHERE (attributes->>'price')::NUMERIC < 1000;

-- MySQL: Build JSON object from row
SELECT JSON_OBJECT(
    'product_name', name,
    'brand', JSON_EXTRACT(attributes, '$.brand'),
    'price', JSON_EXTRACT(attributes, '$.price')
) AS product_summary
FROM products;

-- Aggregate rows into JSON array
SELECT JSON_ARRAYAGG(
    JSON_OBJECT(
        'name', name,
        'price', JSON_EXTRACT(attributes, '$.price')
    )
) AS products_list
FROM products
WHERE JSON_EXTRACT(attributes, '$.price') < 1000;

Indexing JSON Data

Indexes on JSON fields dramatically improve query performance for frequently accessed paths:

-- PostgreSQL: GIN index for containment queries
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);

-- Index specific JSON path
CREATE INDEX idx_products_brand ON products ((attributes->>'brand'));

-- MySQL: Generated column with index
ALTER TABLE products 
ADD COLUMN brand VARCHAR(100) AS (attributes->>'$.brand');

CREATE INDEX idx_products_brand ON products(brand);

-- Functional index on JSON path (MySQL 8.0.13+)
CREATE INDEX idx_products_price ON products(
    (CAST(attributes->>'$.price' AS DECIMAL(10,2)))
);

Validating JSON Structure

Enforce JSON schema constraints using check constraints or triggers:

-- PostgreSQL: Check constraint for required keys
ALTER TABLE products
ADD CONSTRAINT check_required_attributes
CHECK (
    attributes ? 'brand' AND
    attributes ? 'price' AND
    (attributes->>'price')::NUMERIC > 0
);

-- MySQL: Check constraint (8.0.16+)
ALTER TABLE products
ADD CONSTRAINT check_required_attributes
CHECK (
    JSON_CONTAINS_PATH(attributes, 'all', '$.brand', '$.price') AND
    JSON_EXTRACT(attributes, '$.price') > 0
);

For complex validation, create stored procedures that validate JSON against application-defined schemas before insertion.

Performance Considerations

JSON functions add computational overhead compared to native column access. Use JSON for semi-structured data that changes frequently or varies between records. Store frequently queried fields as regular columns.

-- Hybrid approach: indexed columns + flexible JSON
CREATE TABLE products_optimized (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    brand VARCHAR(100) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    additional_attributes JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_brand ON products_optimized(brand);
CREATE INDEX idx_price ON products_optimized(price);
CREATE INDEX idx_additional ON products_optimized USING GIN (additional_attributes);

This pattern provides fast queries on common fields while maintaining flexibility for variable attributes. Benchmark your specific workload to determine the optimal balance between relational columns and JSON storage.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.