SQL - Recursive CTE with Examples
A Common Table Expression (CTE) is a temporary named result set that exists only for the duration of a single query. Think of it as a disposable view that makes complex queries readable and...
Key Insights
- Recursive CTEs follow a simple pattern: an anchor query that starts the recursion, UNION ALL, and a recursive query that references itself—master this structure and you can traverse any hierarchical data.
- The most practical applications are organizational hierarchies, bill of materials explosions, and generating date/number sequences without relying on database-specific functions.
- Always set MAXRECURSION limits and include termination conditions in your recursive member to prevent runaway queries that consume server resources.
What Are Common Table Expressions?
A Common Table Expression (CTE) is a temporary named result set that exists only for the duration of a single query. Think of it as a disposable view that makes complex queries readable and maintainable.
Before diving into recursion, let’s establish the basic syntax. Here’s a subquery approach versus a CTE for the same logic:
-- Subquery approach: harder to read, especially when nested
SELECT department_name, avg_salary
FROM (
SELECT department_id, AVG(salary) as avg_salary
FROM employees
GROUP BY department_id
) dept_salaries
JOIN departments d ON d.id = dept_salaries.department_id
WHERE avg_salary > 75000;
-- CTE approach: logic flows top to bottom
WITH dept_salaries AS (
SELECT department_id, AVG(salary) as avg_salary
FROM employees
GROUP BY department_id
)
SELECT department_name, avg_salary
FROM dept_salaries
JOIN departments d ON d.id = dept_salaries.department_id
WHERE avg_salary > 75000;
The CTE version reads like a story: first calculate department salaries, then join and filter. This becomes crucial when you need to reference the same derived table multiple times or when building recursive queries.
Anatomy of a Recursive CTE
A recursive CTE has three components that execute in a specific order:
- Anchor member: The initial query that provides the base result set (no self-reference)
- UNION ALL: Combines anchor results with recursive results
- Recursive member: References the CTE itself, processing rows from the previous iteration
The database executes the anchor first, then repeatedly executes the recursive member using the previous iteration’s results until no new rows are returned.
-- Generate numbers 1 through 10
WITH RECURSIVE counter AS (
-- Anchor: start at 1
SELECT 1 AS n
UNION ALL
-- Recursive: add 1 to previous value
SELECT n + 1
FROM counter
WHERE n < 10 -- Termination condition
)
SELECT n FROM counter;
Note: SQL Server omits the RECURSIVE keyword. PostgreSQL and MySQL 8+ require it.
Here’s the execution flow:
| Iteration | Rows Produced | n values |
|---|---|---|
| 1 (anchor) | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| … | … | … |
| 10 | 1 | 10 |
| 11 | 0 | (none, recursion stops) |
The WHERE n < 10 condition is critical. Without it, the query runs indefinitely until the database kills it or hits a recursion limit.
Hierarchical Data: Employee-Manager Trees
The classic recursive CTE use case is traversing organizational hierarchies. Given a table where each employee references their manager’s ID, you can build the complete reporting chain.
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100),
manager_id INT REFERENCES employees(id),
title VARCHAR(100)
);
-- Sample data
INSERT INTO employees VALUES
(1, 'Sarah Chen', NULL, 'CEO'),
(2, 'Marcus Johnson', 1, 'VP Engineering'),
(3, 'Lisa Park', 1, 'VP Sales'),
(4, 'David Kim', 2, 'Engineering Manager'),
(5, 'Amy Torres', 2, 'Engineering Manager'),
(6, 'James Wilson', 4, 'Senior Developer'),
(7, 'Nina Patel', 4, 'Developer');
-- Query: Show full hierarchy with depth levels
WITH RECURSIVE org_chart AS (
-- Anchor: start with the CEO (no manager)
SELECT
id,
name,
title,
manager_id,
0 AS depth,
CAST(name AS VARCHAR(1000)) AS path
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: find direct reports of current level
SELECT
e.id,
e.name,
e.title,
e.manager_id,
oc.depth + 1,
CAST(oc.path || ' > ' || e.name AS VARCHAR(1000))
FROM employees e
INNER JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT
REPEAT(' ', depth) || name AS employee,
title,
depth,
path
FROM org_chart
ORDER BY path;
The path column builds a breadcrumb trail that’s useful for sorting hierarchically. The depth column tracks how many levels deep each employee sits, enabling indented display or filtering by management level.
Working with Graph Data: Bill of Materials
Manufacturing and e-commerce systems often need bill of materials (BOM) queries. A product contains components, which themselves contain sub-components. Recursive CTEs handle this naturally.
CREATE TABLE products (
id INT PRIMARY KEY,
name VARCHAR(100),
unit_cost DECIMAL(10,2)
);
CREATE TABLE product_components (
parent_id INT REFERENCES products(id),
child_id INT REFERENCES products(id),
quantity INT,
PRIMARY KEY (parent_id, child_id)
);
-- Sample: A bicycle contains parts, some parts contain sub-parts
INSERT INTO products VALUES
(1, 'Bicycle', 0),
(2, 'Frame Assembly', 50.00),
(3, 'Wheel Assembly', 0),
(4, 'Frame', 45.00),
(5, 'Seat Post', 5.00),
(6, 'Rim', 15.00),
(7, 'Tire', 12.00),
(8, 'Spokes', 8.00);
INSERT INTO product_components VALUES
(1, 2, 1), -- Bicycle needs 1 Frame Assembly
(1, 3, 2), -- Bicycle needs 2 Wheel Assemblies
(2, 4, 1), -- Frame Assembly needs 1 Frame
(2, 5, 1), -- Frame Assembly needs 1 Seat Post
(3, 6, 1), -- Wheel Assembly needs 1 Rim
(3, 7, 1), -- Wheel Assembly needs 1 Tire
(3, 8, 36); -- Wheel Assembly needs 36 Spokes
-- BOM explosion: all components needed to build a Bicycle
WITH RECURSIVE bom AS (
-- Anchor: start with top-level product
SELECT
p.id,
p.name,
1 AS quantity,
p.unit_cost,
0 AS level,
CAST(p.name AS VARCHAR(500)) AS assembly_path
FROM products p
WHERE p.id = 1 -- Bicycle
UNION ALL
-- Recursive: find all components
SELECT
p.id,
p.name,
pc.quantity * bom.quantity, -- Cumulative quantity
p.unit_cost,
bom.level + 1,
CAST(bom.assembly_path || ' > ' || p.name AS VARCHAR(500))
FROM products p
INNER JOIN product_components pc ON p.id = pc.child_id
INNER JOIN bom ON pc.parent_id = bom.id
)
SELECT
REPEAT(' ', level) || name AS component,
quantity AS total_needed,
unit_cost,
quantity * unit_cost AS extended_cost
FROM bom
ORDER BY assembly_path;
The cumulative quantity calculation (pc.quantity * bom.quantity) is the key insight here. When you need 2 Wheel Assemblies and each needs 36 Spokes, the query correctly reports 72 total Spokes.
Generating Series and Date Ranges
Not every database has a generate_series() function. Recursive CTEs fill this gap, particularly useful for generating date ranges for reporting.
-- Generate a date series for the current month
WITH RECURSIVE date_series AS (
SELECT DATE('2024-01-01') AS report_date
UNION ALL
SELECT report_date + INTERVAL '1 day'
FROM date_series
WHERE report_date < DATE('2024-01-31')
)
SELECT
ds.report_date,
COALESCE(COUNT(o.id), 0) AS order_count,
COALESCE(SUM(o.total), 0) AS daily_revenue
FROM date_series ds
LEFT JOIN orders o ON DATE(o.created_at) = ds.report_date
GROUP BY ds.report_date
ORDER BY ds.report_date;
This pattern ensures you see every date in your report, even days with zero orders. Without the date series, a simple GROUP BY would skip dates with no data.
Performance Considerations and Pitfalls
Recursive CTEs can be resource-intensive. The database materializes intermediate results and joins against them repeatedly. Here’s how to stay safe:
Set recursion limits explicitly:
-- SQL Server syntax
WITH org_chart AS (
SELECT id, name, manager_id, 0 AS depth
FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id, oc.depth + 1
FROM employees e
INNER JOIN org_chart oc ON e.manager_id = oc.id
WHERE oc.depth < 100 -- Application-level limit
)
SELECT * FROM org_chart
OPTION (MAXRECURSION 100); -- Database-level safety net
-- PostgreSQL: no MAXRECURSION, rely on WHERE clause
-- MySQL 8+: SET cte_max_recursion_depth = 100;
Index the join columns. The recursive member performs a join on every iteration. For employee hierarchies, index manager_id. For BOMs, index both parent_id and child_id.
Watch for exponential growth. If your data has multiple paths to the same node (a graph, not a tree), the result set can explode. Add cycle detection:
WITH RECURSIVE traverse AS (
SELECT id, name, ARRAY[id] AS visited
FROM nodes WHERE id = 1
UNION ALL
SELECT n.id, n.name, t.visited || n.id
FROM nodes n
INNER JOIN traverse t ON n.parent_id = t.id
WHERE n.id != ALL(t.visited) -- Cycle detection
)
SELECT * FROM traverse;
Practical Tips and Alternatives
Debug incrementally. Start with just the anchor member and verify it returns expected rows. Then add the recursive member with a restrictive WHERE clause (like depth < 2) and expand gradually.
Consider alternatives for deep hierarchies. If you query the same hierarchy frequently:
- Closure tables store all ancestor-descendant pairs, enabling single-join queries
- Materialized path stores the full path as a string column (
/1/2/4/7/) - Nested sets use left/right boundary numbers for fast subtree queries
- SQL Server’s hierarchyid type provides built-in hierarchy operations
Know your database’s syntax:
| Feature | PostgreSQL | MySQL 8+ | SQL Server |
|---|---|---|---|
| Keyword | WITH RECURSIVE | WITH RECURSIVE | WITH |
| Max depth | No built-in limit | cte_max_recursion_depth | MAXRECURSION hint |
| Array support | Yes | No | No |
Recursive CTEs are one of SQL’s most powerful features for hierarchical data. Master the anchor-union-recursive pattern, always include termination conditions, and index your join columns. When performance becomes critical, evaluate whether a denormalized structure like closure tables might serve you better for read-heavy workloads.