How to Use Recursive CTEs in SQLite
Common Table Expressions (CTEs) are named temporary result sets that exist only for the duration of a query. They make complex SQL more readable by breaking it into logical chunks. A standard CTE...
Key Insights
- Recursive CTEs let you query hierarchical and graph-based data in SQLite without application-level loops, using a base case plus a recursive step that references itself until a termination condition is met.
- The recursive member executes iteratively—not truly recursively—building results row by row until no new rows are generated, making it essential to design proper termination conditions to prevent infinite loops.
- While powerful for org charts, bill of materials, and graph traversal, recursive CTEs can be slow on large datasets; consider alternative patterns like closure tables when performance matters more than schema simplicity.
Introduction to CTEs and Recursive CTEs
Common Table Expressions (CTEs) are named temporary result sets that exist only for the duration of a query. They make complex SQL more readable by breaking it into logical chunks. A standard CTE looks like this:
WITH active_users AS (
SELECT * FROM users WHERE last_login > date('now', '-30 days')
)
SELECT * FROM active_users WHERE premium = 1;
Recursive CTEs take this further by allowing the CTE to reference itself, enabling queries that would otherwise require procedural code. You need SQLite 3.8.3 or later (released August 2014) to use them. If you’re on any remotely modern SQLite version, you’re fine.
Use recursive CTEs when your data has hierarchical relationships (employees reporting to managers), graph structures (social networks, dependencies), or when you need to generate sequences. They’re particularly valuable when the depth of the hierarchy is unknown at query time.
Anatomy of a Recursive CTE
The syntax follows a specific pattern:
WITH RECURSIVE cte_name (column_list) AS (
-- Base case (anchor member)
SELECT initial_values
UNION ALL
-- Recursive case (recursive member)
SELECT new_values
FROM cte_name
WHERE termination_condition
)
SELECT * FROM cte_name;
The base case runs first and provides the initial rows. Think of it as the seed data. The recursive case then references the CTE itself, operating on the results from the previous iteration. SQLite executes this iteratively: run the base case, then run the recursive case on those results, then run the recursive case on the new results, and so on.
UNION ALL combines results from each iteration. Use UNION ALL rather than UNION unless you specifically need to eliminate duplicates (which adds performance overhead). The recursion continues until the recursive member produces no new rows.
The termination condition is critical. Without it, you’ll create an infinite loop. SQLite has a default recursion limit (usually 1000), but hitting it means your query design needs work.
Practical Example: Generating Number Sequences
Let’s generate numbers from 1 to 10:
WITH RECURSIVE numbers(n) AS (
-- Base case: start with 1
SELECT 1
UNION ALL
-- Recursive case: add 1 until we reach 10
SELECT n + 1
FROM numbers
WHERE n < 10
)
SELECT n FROM numbers;
Here’s how SQLite executes this:
- Iteration 0 (base case): Returns
1 - Iteration 1: Takes
1, adds 1, returns2(WHERE 1 < 10 is true) - Iteration 2: Takes
2, adds 1, returns3(WHERE 2 < 10 is true) - Iterations 3-9: Continue the pattern
- Iteration 10: Takes
10, but WHERE 10 < 10 is false, returns nothing - Recursion stops: No new rows generated
The final result set contains all accumulated rows: 1 through 10.
This pattern is useful for generating date ranges, filling gaps in sequences, or creating test data:
WITH RECURSIVE dates(d) AS (
SELECT date('2024-01-01')
UNION ALL
SELECT date(d, '+1 day')
FROM dates
WHERE d < date('2024-01-31')
)
SELECT d FROM dates;
Hierarchical Data: Organization Chart
Recursive CTEs shine with hierarchical data. Consider this employee table:
CREATE TABLE employees (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
manager_id INTEGER,
FOREIGN KEY (manager_id) REFERENCES employees(id)
);
INSERT INTO employees VALUES
(1, 'Alice CEO', NULL),
(2, 'Bob VP Eng', 1),
(3, 'Carol VP Sales', 1),
(4, 'Dave Eng Manager', 2),
(5, 'Eve Sales Manager', 3),
(6, 'Frank Engineer', 4),
(7, 'Grace Engineer', 4);
To find all employees under Bob (including Bob himself) with their hierarchy level:
WITH RECURSIVE org_chart(id, name, manager_id, level) AS (
-- Base case: start with Bob
SELECT id, name, manager_id, 0
FROM employees
WHERE name = 'Bob VP Eng'
UNION ALL
-- Recursive case: find direct reports
SELECT e.id, e.name, e.manager_id, oc.level + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT
level,
name,
' ' || substr('................', 1, level * 2) || name as indented_name
FROM org_chart
ORDER BY level, name;
Results:
level | name | indented_name
------|-------------------|------------------
0 | Bob VP Eng | Bob VP Eng
1 | Dave Eng Manager | Dave Eng Manager
2 | Frank Engineer | Frank Engineer
2 | Grace Engineer | Grace Engineer
The recursion traverses downward through the reporting chain. Each iteration finds employees whose manager_id matches an id from the previous iteration. The level column increments with each step, showing depth in the hierarchy.
Advanced Pattern: Bill of Materials (BOM)
A bill of materials represents products made from components, which themselves may be made from sub-components. This requires a many-to-many relationship:
CREATE TABLE parts (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
cost REAL NOT NULL
);
CREATE TABLE assemblies (
parent_id INTEGER,
child_id INTEGER,
quantity INTEGER NOT NULL,
PRIMARY KEY (parent_id, child_id),
FOREIGN KEY (parent_id) REFERENCES parts(id),
FOREIGN KEY (child_id) REFERENCES parts(id)
);
INSERT INTO parts VALUES
(1, 'Bicycle', 0),
(2, 'Frame', 100),
(3, 'Wheel', 50),
(4, 'Rim', 20),
(5, 'Tire', 15),
(6, 'Spoke', 0.50);
INSERT INTO assemblies VALUES
(1, 2, 1), -- Bicycle needs 1 Frame
(1, 3, 2), -- Bicycle needs 2 Wheels
(3, 4, 1), -- Wheel needs 1 Rim
(3, 5, 1), -- Wheel needs 1 Tire
(3, 6, 32); -- Wheel needs 32 Spokes
To explode the BOM for a bicycle and calculate total cost:
WITH RECURSIVE bom(parent_id, child_id, quantity, depth) AS (
-- Base case: top-level product
SELECT id, id, 1, 0
FROM parts
WHERE name = 'Bicycle'
UNION ALL
-- Recursive case: components and sub-components
SELECT bom.parent_id, a.child_id, bom.quantity * a.quantity, bom.depth + 1
FROM bom
JOIN assemblies a ON bom.child_id = a.parent_id
)
SELECT
p.name,
SUM(bom.quantity) as total_quantity,
p.cost,
SUM(bom.quantity * p.cost) as total_cost
FROM bom
JOIN parts p ON bom.child_id = p.id
WHERE bom.child_id != bom.parent_id -- Exclude the root item itself
GROUP BY p.name, p.cost
ORDER BY p.name;
This query multiplies quantities as it descends the hierarchy. A bicycle needs 2 wheels, each wheel needs 32 spokes, so the total spoke count is 64.
For circular references (Part A contains Part B, Part B contains Part A), add cycle detection:
WITH RECURSIVE bom(parent_id, child_id, quantity, path) AS (
SELECT id, id, 1, CAST(id AS TEXT)
FROM parts WHERE name = 'Bicycle'
UNION ALL
SELECT bom.parent_id, a.child_id, bom.quantity * a.quantity,
bom.path || ',' || a.child_id
FROM bom
JOIN assemblies a ON bom.child_id = a.parent_id
WHERE bom.path NOT LIKE '%,' || a.child_id || ',%'
)
SELECT * FROM bom;
The path column tracks visited nodes, preventing infinite loops.
Performance Considerations and Limitations
SQLite’s default recursion limit is 1000 iterations, changeable with PRAGMA recursive_triggers. If you hit this limit, your data model or query needs rethinking.
Recursive CTEs work well for:
- Hierarchies with depth < 100
- Graphs with a few thousand nodes
- Queries run occasionally, not in hot paths
They perform poorly for:
- Deep hierarchies (organizational charts with 20+ levels)
- Queries run thousands of times per second
- Very wide trees (nodes with hundreds of children)
Index the foreign key columns used in joins:
CREATE INDEX idx_employees_manager ON employees(manager_id);
CREATE INDEX idx_assemblies_parent ON assemblies(parent_id);
For frequently-queried hierarchies, consider a closure table that pre-computes all ancestor-descendant relationships:
CREATE TABLE employee_closure (
ancestor_id INTEGER,
descendant_id INTEGER,
depth INTEGER,
PRIMARY KEY (ancestor_id, descendant_id)
);
This trades write complexity and storage for read performance. Or use the nested set model, which represents trees as ranges but makes updates expensive.
Common Pitfalls and Debugging Tips
Missing termination condition is the most common mistake:
-- WRONG: infinite loop
WITH RECURSIVE bad(n) AS (
SELECT 1
UNION ALL
SELECT n + 1 FROM bad -- No WHERE clause!
)
SELECT * FROM bad LIMIT 10;
This will hit the recursion limit. Always include a WHERE clause that eventually becomes false.
Cartesian products happen when you forget join conditions:
-- WRONG: explodes exponentially
WITH RECURSIVE bad(id, name) AS (
SELECT id, name FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name
FROM employees e, bad -- Missing ON clause
)
SELECT * FROM bad;
Debug with LIMIT to see intermediate results:
WITH RECURSIVE debug(n, iteration) AS (
SELECT 1, 1
UNION ALL
SELECT n * 2, iteration + 1
FROM debug
WHERE iteration < 5
)
SELECT * FROM debug;
This shows exactly what each iteration produces.
Use EXPLAIN QUERY PLAN to verify indexes are being used:
EXPLAIN QUERY PLAN
WITH RECURSIVE org AS (
SELECT id FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.id FROM employees e JOIN org ON e.manager_id = org.id
)
SELECT * FROM org;
Look for “USING INDEX” in the output. If you see “SCAN TABLE”, add indexes.
Recursive CTEs are powerful tools for hierarchical data in SQLite. Master the base case/recursive case pattern, always include termination conditions, and know when to use alternative approaches for performance-critical code. They eliminate the need for application-level loops and make complex queries readable and maintainable.