How to Use Recursive CTEs in MySQL
Common Table Expressions (CTEs) are named temporary result sets that exist only during query execution. Think of them as inline views that improve readability and enable complex query patterns. MySQL...
Key Insights
- Recursive CTEs in MySQL 8.0+ replace procedural loops for hierarchical queries, making tree traversal and graph operations possible in pure SQL
- Every recursive CTE requires an anchor member (base case) and a recursive member joined with UNION ALL, with an implicit termination when no rows are returned
- Performance depends heavily on proper indexing of join columns and setting appropriate recursion depth limits—default is 1000 iterations, which you can adjust per session
Introduction to CTEs and Recursive CTEs
Common Table Expressions (CTEs) are named temporary result sets that exist only during query execution. Think of them as inline views that improve readability and enable complex query patterns. MySQL added CTE support in version 8.0, finally catching up with PostgreSQL and SQL Server.
A recursive CTE references itself, allowing you to traverse hierarchical data structures, generate sequences, or perform graph operations—tasks that previously required stored procedures or application-level loops. If you’re still on MySQL 5.7, you’re out of luck. Upgrade to 8.0 or later.
Here’s a simple non-recursive CTE to establish the syntax:
WITH regional_sales AS (
SELECT region, SUM(amount) as total_sales
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY region
)
SELECT region, total_sales
FROM regional_sales
WHERE total_sales > 100000
ORDER BY total_sales DESC;
The WITH clause defines the CTE, then you query it like any table. Recursive CTEs follow the same pattern but add the RECURSIVE keyword and self-reference.
Anatomy of a Recursive CTE
A recursive CTE has three essential components:
- Anchor member: The base case that starts the recursion
- Recursive member: References the CTE itself to build on previous results
- UNION ALL: Combines anchor and recursive members (must be UNION ALL, not UNION)
The recursion terminates automatically when the recursive member returns no rows. No explicit stop condition needed, though you should always have an implicit one to prevent infinite loops.
WITH RECURSIVE cte_name AS (
-- 1. Anchor member: initial row(s)
SELECT id, parent_id, name, 1 as level
FROM base_table
WHERE parent_id IS NULL
UNION ALL
-- 2. Recursive member: references cte_name
SELECT t.id, t.parent_id, t.name, cte.level + 1
FROM base_table t
INNER JOIN cte_name cte ON t.parent_id = cte.id
-- 3. Implicit termination when no rows match the join
)
SELECT * FROM cte_name;
The database executes the anchor member first, then repeatedly executes the recursive member using the previous iteration’s results until no new rows are produced.
Generating Sequential Data
Recursive CTEs excel at generating sequences without needing a numbers table. This is useful for filling gaps in time series data or creating test datasets.
Generate numbers 1 through 10:
WITH RECURSIVE numbers AS (
SELECT 1 as n
UNION ALL
SELECT n + 1
FROM numbers
WHERE n < 10
)
SELECT n FROM numbers;
The anchor member produces 1. The recursive member adds 1 each iteration until n reaches 10, satisfying the termination condition.
Create a date range for the past 30 days:
WITH RECURSIVE date_range AS (
SELECT CURDATE() as date
UNION ALL
SELECT DATE_SUB(date, INTERVAL 1 DAY)
FROM date_range
WHERE date > DATE_SUB(CURDATE(), INTERVAL 30 DAY)
)
SELECT date
FROM date_range
ORDER BY date;
This pattern is invaluable for reports that need to show every day in a range, even when no data exists for some dates. Join this CTE with your fact table using a LEFT JOIN to preserve all dates.
Traversing Hierarchical Data
The most common real-world use case: navigating organizational charts, category trees, or any parent-child relationship stored in a single table.
Consider an employee table with manager relationships:
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100),
manager_id INT,
title VARCHAR(100)
);
-- Find all employees under a specific manager (including indirect reports)
WITH RECURSIVE employee_hierarchy AS (
-- Anchor: start with the top manager
SELECT id, name, manager_id, title, 1 as level
FROM employees
WHERE id = 5 -- CEO or department head
UNION ALL
-- Recursive: find direct reports of each person in the hierarchy
SELECT e.id, e.name, e.manager_id, e.title, eh.level + 1
FROM employees e
INNER JOIN employee_hierarchy eh ON e.manager_id = eh.id
)
SELECT
CONCAT(REPEAT(' ', level - 1), name) as org_chart,
title,
level
FROM employee_hierarchy
ORDER BY level, name;
The REPEAT function indents names based on hierarchy level, creating a visual org chart. The level column tracks depth, useful for limiting recursion or analyzing organizational structure.
For product categories with unlimited nesting:
WITH RECURSIVE category_tree AS (
-- Anchor: top-level categories
SELECT id, name, parent_id, name as path
FROM categories
WHERE parent_id IS NULL
UNION ALL
-- Recursive: child categories
SELECT c.id, c.name, c.parent_id,
CONCAT(ct.path, ' > ', c.name) as path
FROM categories c
INNER JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT id, name, path
FROM category_tree
ORDER BY path;
The path column builds a breadcrumb trail showing the full category hierarchy, useful for navigation menus or filtering.
Graph Traversal and Path Finding
Beyond simple trees, recursive CTEs handle graph structures like bill of materials, network routes, or social connections.
Bill of materials explosion (find all components needed to build a product):
CREATE TABLE bill_of_materials (
product_id INT,
component_id INT,
quantity INT,
PRIMARY KEY (product_id, component_id)
);
WITH RECURSIVE bom_explosion AS (
-- Anchor: top-level product
SELECT
product_id,
component_id,
quantity,
1 as level,
quantity as total_quantity
FROM bill_of_materials
WHERE product_id = 100 -- Final product
UNION ALL
-- Recursive: components of components
SELECT
bom.product_id,
bom.component_id,
bom.quantity,
be.level + 1,
be.total_quantity * bom.quantity
FROM bill_of_materials bom
INNER JOIN bom_explosion be ON bom.product_id = be.component_id
WHERE be.level < 10 -- Prevent excessive depth
)
SELECT
component_id,
SUM(total_quantity) as total_needed,
MAX(level) as max_depth
FROM bom_explosion
GROUP BY component_id
ORDER BY max_depth, component_id;
The total_quantity calculation multiplies quantities at each level, giving you the actual number of each component needed. The level < 10 condition prevents runaway recursion in case of circular references (which shouldn’t exist in a proper BOM but defensive coding never hurts).
Performance Considerations and Limits
MySQL enforces a recursion depth limit to prevent infinite loops from consuming server resources. The default is 1000 iterations, controlled by the cte_max_recursion_depth session variable.
-- Increase limit for deep hierarchies
SET SESSION cte_max_recursion_depth = 5000;
-- Or set to 0 for unlimited (dangerous!)
SET SESSION cte_max_recursion_depth = 0;
-- Check current setting
SELECT @@cte_max_recursion_depth;
Performance tips:
Index your join columns: The recursive member joins the CTE with your base table. Without proper indexes, each iteration performs a full table scan. For the employee hierarchy example, index manager_id. For categories, index parent_id.
Limit recursion explicitly: Don’t rely solely on the depth limit. Add WHERE level < X conditions to stop recursion at a known depth.
Avoid recursive CTEs for large result sets: If you’re returning millions of rows, consider materializing intermediate results in temporary tables or redesigning your schema to use nested sets or path enumeration models.
Use EXPLAIN to analyze: Check the execution plan. If you see table scans on large tables, you need indexes.
When NOT to use recursive CTEs:
- Simple parent-child lookups (a single join suffices)
- Known fixed depth hierarchies (explicit joins are faster)
- Graphs with cycles unless you implement cycle detection
- Performance-critical paths where milliseconds matter
Common Pitfalls and Best Practices
Infinite loops: The most dangerous pitfall. Always ensure your recursive member eventually produces zero rows.
-- BAD: Will run until hitting recursion limit
WITH RECURSIVE infinite AS (
SELECT 1 as n
UNION ALL
SELECT n + 1 FROM infinite -- No termination condition!
)
SELECT * FROM infinite;
Cycle detection: For graph structures that might contain cycles, track visited nodes:
WITH RECURSIVE paths AS (
-- Anchor: starting node
SELECT
id,
parent_id,
CAST(id AS CHAR(1000)) as path,
0 as is_cycle
FROM graph
WHERE id = 1
UNION ALL
-- Recursive: follow edges, detect cycles
SELECT
g.id,
g.parent_id,
CONCAT(p.path, ',', g.id),
FIND_IN_SET(g.id, p.path) > 0 as is_cycle
FROM graph g
INNER JOIN paths p ON g.parent_id = p.id
WHERE p.is_cycle = 0 -- Stop following cyclic paths
)
SELECT * FROM paths;
The path column concatenates visited node IDs. FIND_IN_SET checks if the current node already exists in the path, indicating a cycle.
Memory consumption: Each iteration’s results are held in memory. Deep recursion with wide result sets can exhaust server memory. Monitor your queries during development.
Use meaningful column names: Include level, depth, or path columns to make results self-documenting and enable better filtering.
Recursive CTEs transform how you handle hierarchical data in MySQL. They replace complex application code with declarative SQL, making your queries more maintainable and often more performant. Master the anchor-recursive-termination pattern, index appropriately, and you’ll handle everything from org charts to graph traversal with confidence.