SQL - JSON Functions in SQL
Most modern relational databases support native JSON data types that validate and optimize JSON storage. PostgreSQL, MySQL 8.0+, SQL Server 2016+, and Oracle 12c+ all provide JSON capabilities with...
Key Insights
- Modern SQL databases provide native JSON functions that eliminate the need for application-level parsing, reducing network overhead and improving query performance by up to 10x for nested data operations
- JSON path expressions enable direct access to nested properties within stored JSON documents, allowing complex data transformations in a single query without multiple round trips
- Combining relational and JSON querying patterns creates flexible schemas that support both structured analytics and semi-structured data storage in the same database
JSON Data Types and Storage
Most modern relational databases support native JSON data types that validate and optimize JSON storage. PostgreSQL, MySQL 8.0+, SQL Server 2016+, and Oracle 12c+ all provide JSON capabilities with slightly different syntax.
-- PostgreSQL
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255),
attributes JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
-- MySQL
CREATE TABLE products (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255),
attributes JSON,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
PostgreSQL’s JSONB (binary JSON) offers better query performance than JSON type due to pre-parsed storage format. MySQL’s JSON type automatically validates and stores data in an optimized binary format.
-- Insert JSON data
INSERT INTO products (name, attributes) VALUES
('Laptop', '{"brand": "Dell", "specs": {"ram": "16GB", "storage": "512GB SSD"}, "price": 999.99}'),
('Mouse', '{"brand": "Logitech", "wireless": true, "dpi": 1600, "price": 49.99}');
Extracting JSON Values
JSON extraction operators retrieve values from JSON documents using path expressions. Each database uses different operators but similar concepts.
-- PostgreSQL: -> returns JSON, ->> returns text
SELECT
name,
attributes->>'brand' AS brand,
attributes->'specs'->>'ram' AS ram,
(attributes->>'price')::NUMERIC AS price
FROM products;
-- MySQL: JSON_EXTRACT or -> operator
SELECT
name,
JSON_UNQUOTE(JSON_EXTRACT(attributes, '$.brand')) AS brand,
JSON_EXTRACT(attributes, '$.specs.ram') AS ram,
JSON_EXTRACT(attributes, '$.price') AS price
FROM products;
-- Shorter MySQL syntax
SELECT
name,
attributes->>'$.brand' AS brand,
attributes->>'$.specs.ram' AS ram,
attributes->>'$.price' AS price
FROM products;
The $ symbol represents the root of the JSON document. Dot notation navigates nested objects. PostgreSQL uses -> for JSON output and ->> for text output, while MySQL uses -> for JSON and ->> for unquoted text.
Querying JSON Arrays
JSON arrays require special functions to access elements by index or iterate over all elements.
-- Sample data with arrays
INSERT INTO products (name, attributes) VALUES
('Camera', '{"brand": "Canon", "features": ["4K Video", "WiFi", "Touch Screen"], "price": 799.99}');
-- PostgreSQL: Access array elements (0-indexed)
SELECT
name,
attributes->'features'->0 AS first_feature,
jsonb_array_length(attributes->'features') AS feature_count
FROM products
WHERE attributes ? 'features';
-- MySQL: Access array elements
SELECT
name,
JSON_EXTRACT(attributes, '$.features[0]') AS first_feature,
JSON_LENGTH(attributes, '$.features') AS feature_count
FROM products
WHERE JSON_CONTAINS_PATH(attributes, 'one', '$.features');
Expanding arrays into rows enables joining JSON array data with relational tables:
-- PostgreSQL: Expand array to rows
SELECT
p.name,
feature.value AS feature
FROM products p
CROSS JOIN LATERAL jsonb_array_elements_text(p.attributes->'features') AS feature;
-- MySQL: Expand array to rows (8.0+)
SELECT
p.name,
jt.feature
FROM products p
CROSS JOIN JSON_TABLE(
p.attributes,
'$.features[*]' COLUMNS(feature VARCHAR(100) PATH '$')
) AS jt;
Filtering and Searching JSON Data
JSON functions in WHERE clauses enable powerful filtering without extracting data to application code.
-- PostgreSQL: Filter by JSON values
SELECT name, attributes
FROM products
WHERE (attributes->>'price')::NUMERIC < 500
AND attributes->>'brand' = 'Logitech';
-- Check if key exists
SELECT name
FROM products
WHERE attributes ? 'wireless';
-- Check if any array element matches
SELECT name
FROM products
WHERE attributes->'features' @> '["WiFi"]'::jsonb;
-- MySQL: Filter by JSON values
SELECT name, attributes
FROM products
WHERE JSON_EXTRACT(attributes, '$.price') < 500
AND JSON_EXTRACT(attributes, '$.brand') = 'Logitech';
-- Check if key exists
SELECT name
FROM products
WHERE JSON_CONTAINS_PATH(attributes, 'one', '$.wireless');
-- Search array for value
SELECT name
FROM products
WHERE JSON_CONTAINS(attributes, '"WiFi"', '$.features');
Modifying JSON Data
Update specific JSON properties without replacing the entire document:
-- PostgreSQL: Update nested values
UPDATE products
SET attributes = jsonb_set(
attributes,
'{price}',
'899.99',
true
)
WHERE name = 'Laptop';
-- Add new key-value pair
UPDATE products
SET attributes = attributes || '{"warranty": "2 years"}'::jsonb
WHERE name = 'Laptop';
-- Remove key
UPDATE products
SET attributes = attributes - 'wireless'
WHERE name = 'Mouse';
-- MySQL: Update nested values
UPDATE products
SET attributes = JSON_SET(
attributes,
'$.price',
899.99
)
WHERE name = 'Laptop';
-- Add new key-value pair
UPDATE products
SET attributes = JSON_INSERT(
attributes,
'$.warranty',
'2 years'
)
WHERE name = 'Laptop';
-- Remove key
UPDATE products
SET attributes = JSON_REMOVE(attributes, '$.wireless')
WHERE name = 'Mouse';
Aggregating JSON Data
Build JSON objects and arrays from query results for API responses:
-- PostgreSQL: Build JSON object from row
SELECT jsonb_build_object(
'product_name', name,
'brand', attributes->>'brand',
'price', attributes->>'price'
) AS product_summary
FROM products;
-- Aggregate rows into JSON array
SELECT jsonb_agg(
jsonb_build_object(
'name', name,
'price', attributes->>'price'
)
) AS products_list
FROM products
WHERE (attributes->>'price')::NUMERIC < 1000;
-- MySQL: Build JSON object from row
SELECT JSON_OBJECT(
'product_name', name,
'brand', JSON_EXTRACT(attributes, '$.brand'),
'price', JSON_EXTRACT(attributes, '$.price')
) AS product_summary
FROM products;
-- Aggregate rows into JSON array
SELECT JSON_ARRAYAGG(
JSON_OBJECT(
'name', name,
'price', JSON_EXTRACT(attributes, '$.price')
)
) AS products_list
FROM products
WHERE JSON_EXTRACT(attributes, '$.price') < 1000;
Indexing JSON Data
Indexes on JSON fields dramatically improve query performance for frequently accessed paths:
-- PostgreSQL: GIN index for containment queries
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
-- Index specific JSON path
CREATE INDEX idx_products_brand ON products ((attributes->>'brand'));
-- MySQL: Generated column with index
ALTER TABLE products
ADD COLUMN brand VARCHAR(100) AS (attributes->>'$.brand');
CREATE INDEX idx_products_brand ON products(brand);
-- Functional index on JSON path (MySQL 8.0.13+)
CREATE INDEX idx_products_price ON products(
(CAST(attributes->>'$.price' AS DECIMAL(10,2)))
);
Validating JSON Structure
Enforce JSON schema constraints using check constraints or triggers:
-- PostgreSQL: Check constraint for required keys
ALTER TABLE products
ADD CONSTRAINT check_required_attributes
CHECK (
attributes ? 'brand' AND
attributes ? 'price' AND
(attributes->>'price')::NUMERIC > 0
);
-- MySQL: Check constraint (8.0.16+)
ALTER TABLE products
ADD CONSTRAINT check_required_attributes
CHECK (
JSON_CONTAINS_PATH(attributes, 'all', '$.brand', '$.price') AND
JSON_EXTRACT(attributes, '$.price') > 0
);
For complex validation, create stored procedures that validate JSON against application-defined schemas before insertion.
Performance Considerations
JSON functions add computational overhead compared to native column access. Use JSON for semi-structured data that changes frequently or varies between records. Store frequently queried fields as regular columns.
-- Hybrid approach: indexed columns + flexible JSON
CREATE TABLE products_optimized (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
brand VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL,
additional_attributes JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_brand ON products_optimized(brand);
CREATE INDEX idx_price ON products_optimized(price);
CREATE INDEX idx_additional ON products_optimized USING GIN (additional_attributes);
This pattern provides fast queries on common fields while maintaining flexibility for variable attributes. Benchmark your specific workload to determine the optimal balance between relational columns and JSON storage.