SQL - Recursive CTE with Examples

A Common Table Expression (CTE) is a temporary named result set that exists only for the duration of a single query. Think of it as a disposable view that makes complex queries readable and...

Key Insights

  • Recursive CTEs follow a simple pattern: an anchor query that starts the recursion, UNION ALL, and a recursive query that references itself—master this structure and you can traverse any hierarchical data.
  • The most practical applications are organizational hierarchies, bill of materials explosions, and generating date/number sequences without relying on database-specific functions.
  • Always set MAXRECURSION limits and include termination conditions in your recursive member to prevent runaway queries that consume server resources.

What Are Common Table Expressions?

A Common Table Expression (CTE) is a temporary named result set that exists only for the duration of a single query. Think of it as a disposable view that makes complex queries readable and maintainable.

Before diving into recursion, let’s establish the basic syntax. Here’s a subquery approach versus a CTE for the same logic:

-- Subquery approach: harder to read, especially when nested
SELECT department_name, avg_salary
FROM (
    SELECT department_id, AVG(salary) as avg_salary
    FROM employees
    GROUP BY department_id
) dept_salaries
JOIN departments d ON d.id = dept_salaries.department_id
WHERE avg_salary > 75000;

-- CTE approach: logic flows top to bottom
WITH dept_salaries AS (
    SELECT department_id, AVG(salary) as avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT department_name, avg_salary
FROM dept_salaries
JOIN departments d ON d.id = dept_salaries.department_id
WHERE avg_salary > 75000;

The CTE version reads like a story: first calculate department salaries, then join and filter. This becomes crucial when you need to reference the same derived table multiple times or when building recursive queries.

Anatomy of a Recursive CTE

A recursive CTE has three components that execute in a specific order:

  1. Anchor member: The initial query that provides the base result set (no self-reference)
  2. UNION ALL: Combines anchor results with recursive results
  3. Recursive member: References the CTE itself, processing rows from the previous iteration

The database executes the anchor first, then repeatedly executes the recursive member using the previous iteration’s results until no new rows are returned.

-- Generate numbers 1 through 10
WITH RECURSIVE counter AS (
    -- Anchor: start at 1
    SELECT 1 AS n
    
    UNION ALL
    
    -- Recursive: add 1 to previous value
    SELECT n + 1
    FROM counter
    WHERE n < 10  -- Termination condition
)
SELECT n FROM counter;

Note: SQL Server omits the RECURSIVE keyword. PostgreSQL and MySQL 8+ require it.

Here’s the execution flow:

Iteration Rows Produced n values
1 (anchor) 1 1
2 1 2
3 1 3
10 1 10
11 0 (none, recursion stops)

The WHERE n < 10 condition is critical. Without it, the query runs indefinitely until the database kills it or hits a recursion limit.

Hierarchical Data: Employee-Manager Trees

The classic recursive CTE use case is traversing organizational hierarchies. Given a table where each employee references their manager’s ID, you can build the complete reporting chain.

CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    manager_id INT REFERENCES employees(id),
    title VARCHAR(100)
);

-- Sample data
INSERT INTO employees VALUES
(1, 'Sarah Chen', NULL, 'CEO'),
(2, 'Marcus Johnson', 1, 'VP Engineering'),
(3, 'Lisa Park', 1, 'VP Sales'),
(4, 'David Kim', 2, 'Engineering Manager'),
(5, 'Amy Torres', 2, 'Engineering Manager'),
(6, 'James Wilson', 4, 'Senior Developer'),
(7, 'Nina Patel', 4, 'Developer');

-- Query: Show full hierarchy with depth levels
WITH RECURSIVE org_chart AS (
    -- Anchor: start with the CEO (no manager)
    SELECT 
        id,
        name,
        title,
        manager_id,
        0 AS depth,
        CAST(name AS VARCHAR(1000)) AS path
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive: find direct reports of current level
    SELECT 
        e.id,
        e.name,
        e.title,
        e.manager_id,
        oc.depth + 1,
        CAST(oc.path || ' > ' || e.name AS VARCHAR(1000))
    FROM employees e
    INNER JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT 
    REPEAT('  ', depth) || name AS employee,
    title,
    depth,
    path
FROM org_chart
ORDER BY path;

The path column builds a breadcrumb trail that’s useful for sorting hierarchically. The depth column tracks how many levels deep each employee sits, enabling indented display or filtering by management level.

Working with Graph Data: Bill of Materials

Manufacturing and e-commerce systems often need bill of materials (BOM) queries. A product contains components, which themselves contain sub-components. Recursive CTEs handle this naturally.

CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    unit_cost DECIMAL(10,2)
);

CREATE TABLE product_components (
    parent_id INT REFERENCES products(id),
    child_id INT REFERENCES products(id),
    quantity INT,
    PRIMARY KEY (parent_id, child_id)
);

-- Sample: A bicycle contains parts, some parts contain sub-parts
INSERT INTO products VALUES
(1, 'Bicycle', 0),
(2, 'Frame Assembly', 50.00),
(3, 'Wheel Assembly', 0),
(4, 'Frame', 45.00),
(5, 'Seat Post', 5.00),
(6, 'Rim', 15.00),
(7, 'Tire', 12.00),
(8, 'Spokes', 8.00);

INSERT INTO product_components VALUES
(1, 2, 1),  -- Bicycle needs 1 Frame Assembly
(1, 3, 2),  -- Bicycle needs 2 Wheel Assemblies
(2, 4, 1),  -- Frame Assembly needs 1 Frame
(2, 5, 1),  -- Frame Assembly needs 1 Seat Post
(3, 6, 1),  -- Wheel Assembly needs 1 Rim
(3, 7, 1),  -- Wheel Assembly needs 1 Tire
(3, 8, 36); -- Wheel Assembly needs 36 Spokes

-- BOM explosion: all components needed to build a Bicycle
WITH RECURSIVE bom AS (
    -- Anchor: start with top-level product
    SELECT 
        p.id,
        p.name,
        1 AS quantity,
        p.unit_cost,
        0 AS level,
        CAST(p.name AS VARCHAR(500)) AS assembly_path
    FROM products p
    WHERE p.id = 1  -- Bicycle
    
    UNION ALL
    
    -- Recursive: find all components
    SELECT 
        p.id,
        p.name,
        pc.quantity * bom.quantity,  -- Cumulative quantity
        p.unit_cost,
        bom.level + 1,
        CAST(bom.assembly_path || ' > ' || p.name AS VARCHAR(500))
    FROM products p
    INNER JOIN product_components pc ON p.id = pc.child_id
    INNER JOIN bom ON pc.parent_id = bom.id
)
SELECT 
    REPEAT('  ', level) || name AS component,
    quantity AS total_needed,
    unit_cost,
    quantity * unit_cost AS extended_cost
FROM bom
ORDER BY assembly_path;

The cumulative quantity calculation (pc.quantity * bom.quantity) is the key insight here. When you need 2 Wheel Assemblies and each needs 36 Spokes, the query correctly reports 72 total Spokes.

Generating Series and Date Ranges

Not every database has a generate_series() function. Recursive CTEs fill this gap, particularly useful for generating date ranges for reporting.

-- Generate a date series for the current month
WITH RECURSIVE date_series AS (
    SELECT DATE('2024-01-01') AS report_date
    
    UNION ALL
    
    SELECT report_date + INTERVAL '1 day'
    FROM date_series
    WHERE report_date < DATE('2024-01-31')
)
SELECT 
    ds.report_date,
    COALESCE(COUNT(o.id), 0) AS order_count,
    COALESCE(SUM(o.total), 0) AS daily_revenue
FROM date_series ds
LEFT JOIN orders o ON DATE(o.created_at) = ds.report_date
GROUP BY ds.report_date
ORDER BY ds.report_date;

This pattern ensures you see every date in your report, even days with zero orders. Without the date series, a simple GROUP BY would skip dates with no data.

Performance Considerations and Pitfalls

Recursive CTEs can be resource-intensive. The database materializes intermediate results and joins against them repeatedly. Here’s how to stay safe:

Set recursion limits explicitly:

-- SQL Server syntax
WITH org_chart AS (
    SELECT id, name, manager_id, 0 AS depth
    FROM employees WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, e.manager_id, oc.depth + 1
    FROM employees e
    INNER JOIN org_chart oc ON e.manager_id = oc.id
    WHERE oc.depth < 100  -- Application-level limit
)
SELECT * FROM org_chart
OPTION (MAXRECURSION 100);  -- Database-level safety net

-- PostgreSQL: no MAXRECURSION, rely on WHERE clause
-- MySQL 8+: SET cte_max_recursion_depth = 100;

Index the join columns. The recursive member performs a join on every iteration. For employee hierarchies, index manager_id. For BOMs, index both parent_id and child_id.

Watch for exponential growth. If your data has multiple paths to the same node (a graph, not a tree), the result set can explode. Add cycle detection:

WITH RECURSIVE traverse AS (
    SELECT id, name, ARRAY[id] AS visited
    FROM nodes WHERE id = 1
    
    UNION ALL
    
    SELECT n.id, n.name, t.visited || n.id
    FROM nodes n
    INNER JOIN traverse t ON n.parent_id = t.id
    WHERE n.id != ALL(t.visited)  -- Cycle detection
)
SELECT * FROM traverse;

Practical Tips and Alternatives

Debug incrementally. Start with just the anchor member and verify it returns expected rows. Then add the recursive member with a restrictive WHERE clause (like depth < 2) and expand gradually.

Consider alternatives for deep hierarchies. If you query the same hierarchy frequently:

  • Closure tables store all ancestor-descendant pairs, enabling single-join queries
  • Materialized path stores the full path as a string column (/1/2/4/7/)
  • Nested sets use left/right boundary numbers for fast subtree queries
  • SQL Server’s hierarchyid type provides built-in hierarchy operations

Know your database’s syntax:

Feature PostgreSQL MySQL 8+ SQL Server
Keyword WITH RECURSIVE WITH RECURSIVE WITH
Max depth No built-in limit cte_max_recursion_depth MAXRECURSION hint
Array support Yes No No

Recursive CTEs are one of SQL’s most powerful features for hierarchical data. Master the anchor-union-recursive pattern, always include termination conditions, and index your join columns. When performance becomes critical, evaluate whether a denormalized structure like closure tables might serve you better for read-heavy workloads.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.