How to Use CTEs in SQLite
Common Table Expressions (CTEs) are named temporary result sets that exist only for the duration of a single query. You define them using the `WITH` clause before your main query, and they act as...
Key Insights
- CTEs transform complex queries into readable, modular steps that you can build and test incrementally, making SQL maintenance significantly easier than nested subqueries.
- SQLite materializes CTEs once per query execution, which improves performance for queries that reference the same subquery multiple times but may add overhead for simple cases.
- Recursive CTEs unlock powerful hierarchical and graph traversal capabilities that would otherwise require application-level code or complex self-joins.
What Are CTEs and Why Use Them?
Common Table Expressions (CTEs) are named temporary result sets that exist only for the duration of a single query. You define them using the WITH clause before your main query, and they act as virtual tables you can reference throughout your SQL statement.
CTEs solve three primary problems. First, they make complex queries readable by breaking logic into named, sequential steps. Second, they enable recursive queries for hierarchical data without procedural code. Third, they eliminate redundancy when you need to reference the same subquery multiple times.
Consider this query without a CTE:
SELECT p.product_name, p.price
FROM products p
WHERE p.price > (
SELECT AVG(price) FROM products WHERE category_id = p.category_id
)
AND p.category_id IN (
SELECT category_id FROM products WHERE category_id = p.category_id GROUP BY category_id HAVING COUNT(*) > 5
);
Now with a CTE:
WITH category_averages AS (
SELECT category_id, AVG(price) as avg_price
FROM products
GROUP BY category_id
HAVING COUNT(*) > 5
)
SELECT p.product_name, p.price
FROM products p
JOIN category_averages ca ON p.category_id = ca.category_id
WHERE p.price > ca.avg_price;
The CTE version is clearer, testable in isolation, and SQLite only calculates the category averages once.
Use CTEs when you need readable intermediate steps, recursive operations, or when the same subquery appears multiple times. Stick with subqueries for simple one-off filters. Avoid temporary tables unless you need to index intermediate results or use them across multiple queries.
Basic CTE Syntax and Simple Examples
The basic CTE syntax starts with WITH, followed by the CTE name, an optional column list, and the query definition:
WITH cte_name (column1, column2) AS (
SELECT ...
)
SELECT * FROM cte_name;
Here’s a practical example finding employees earning above their department’s average:
WITH department_averages AS (
SELECT
department_id,
AVG(salary) as avg_salary
FROM employees
GROUP BY department_id
)
SELECT
e.employee_name,
e.salary,
e.department_id,
da.avg_salary,
ROUND(((e.salary - da.avg_salary) / da.avg_salary) * 100, 2) as percent_above_avg
FROM employees e
JOIN department_averages da ON e.department_id = da.department_id
WHERE e.salary > da.avg_salary
ORDER BY percent_above_avg DESC;
This query calculates department averages once, then joins them back to find high earners. The department_averages CTE acts like a table, but SQLite discards it after query execution.
You can reference CTEs in SELECT, INSERT, UPDATE, or DELETE statements. They’re scoped to the statement where they’re defined—you cannot use them in subsequent queries.
Multiple CTEs and Chaining
Define multiple CTEs by separating them with commas after the initial WITH keyword. Later CTEs can reference earlier ones, letting you build complex logic incrementally:
WITH
monthly_sales AS (
SELECT
strftime('%Y-%m', order_date) as month,
SUM(total_amount) as monthly_total
FROM orders
WHERE order_date >= date('now', '-12 months')
GROUP BY strftime('%Y-%m', order_date)
),
yearly_average AS (
SELECT AVG(monthly_total) as avg_monthly_sales
FROM monthly_sales
),
performance_analysis AS (
SELECT
ms.month,
ms.monthly_total,
ya.avg_monthly_sales,
ms.monthly_total - ya.avg_monthly_sales as variance,
CASE
WHEN ms.monthly_total > ya.avg_monthly_sales * 1.1 THEN 'Excellent'
WHEN ms.monthly_total > ya.avg_monthly_sales THEN 'Good'
ELSE 'Below Average'
END as performance
FROM monthly_sales ms
CROSS JOIN yearly_average ya
)
SELECT * FROM performance_analysis
ORDER BY month DESC;
This query chains three CTEs: monthly_sales aggregates raw data, yearly_average summarizes those results, and performance_analysis combines both to categorize each month’s performance. Each step is independently testable—comment out the final SELECT and query any intermediate CTE directly.
This approach transforms what would be an unreadable nested query into a clear data pipeline. You’re essentially writing SQL like procedural code, with each CTE representing a transformation step.
Recursive CTEs
Recursive CTEs are where SQLite’s CTE implementation becomes genuinely powerful. They consist of two parts joined by UNION ALL: an anchor member (the base case) and a recursive member (the iterative step).
The syntax requires the RECURSIVE keyword:
WITH RECURSIVE cte_name AS (
-- Anchor member
SELECT ...
UNION ALL
-- Recursive member
SELECT ... FROM cte_name WHERE ...
)
SELECT * FROM cte_name;
Here’s a practical example traversing an organizational hierarchy:
WITH RECURSIVE employee_hierarchy AS (
-- Anchor: Start with the CEO
SELECT
employee_id,
employee_name,
manager_id,
1 as level,
employee_name as path
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: Find direct reports
SELECT
e.employee_id,
e.employee_name,
e.manager_id,
eh.level + 1,
eh.path || ' > ' || e.employee_name
FROM employees e
JOIN employee_hierarchy eh ON e.manager_id = eh.employee_id
WHERE eh.level < 10 -- Prevent infinite loops
)
SELECT
level,
employee_name,
path
FROM employee_hierarchy
ORDER BY level, employee_name;
The anchor member selects top-level employees (where manager_id IS NULL). The recursive member joins back to the CTE itself, finding each employee’s direct reports. SQLite executes this iteratively until no new rows are found.
Always include a termination condition (like level < 10) to prevent infinite loops. Recursive CTEs will run until they stop producing new rows or hit SQLite’s recursion limit.
Another common use case is generating sequences:
WITH RECURSIVE date_series AS (
SELECT date('now', '-30 days') as date
UNION ALL
SELECT date(date, '+1 day')
FROM date_series
WHERE date < date('now')
)
SELECT date FROM date_series;
This generates the last 30 days, useful for filling gaps in time-series data.
Practical Use Cases
CTEs excel at data transformation pipelines where you need multiple calculation steps. Here’s a product category tree with depth indicators:
WITH RECURSIVE category_tree AS (
-- Root categories
SELECT
category_id,
category_name,
parent_category_id,
0 as depth,
category_name as full_path,
CAST(category_id AS TEXT) as sort_path
FROM categories
WHERE parent_category_id IS NULL
UNION ALL
-- Child categories
SELECT
c.category_id,
c.category_name,
c.parent_category_id,
ct.depth + 1,
ct.full_path || ' / ' || c.category_name,
ct.sort_path || '-' || CAST(c.category_id AS TEXT)
FROM categories c
JOIN category_tree ct ON c.parent_category_id = ct.category_id
),
category_stats AS (
SELECT
category_id,
COUNT(*) as product_count,
AVG(price) as avg_price
FROM products
GROUP BY category_id
)
SELECT
ct.depth,
SUBSTR(' ', 1, ct.depth * 2) || ct.category_name as indented_name,
ct.full_path,
COALESCE(cs.product_count, 0) as products,
COALESCE(cs.avg_price, 0) as avg_price
FROM category_tree ct
LEFT JOIN category_stats cs ON ct.category_id = cs.category_id
ORDER BY ct.sort_path;
This query builds a complete category hierarchy with indentation, full paths, and product statistics. The sort_path ensures proper tree ordering, and the LEFT JOIN handles categories without products.
Use this pattern for org charts, comment threads, file systems, or any hierarchical structure. The recursive CTE handles arbitrary depth without knowing the tree structure in advance.
Performance Considerations and Best Practices
SQLite materializes CTEs once per query execution, storing results in a temporary table. This helps when you reference the same CTE multiple times but adds overhead for simple cases where a subquery would suffice.
Compare these approaches:
-- CTE approach (materialized once)
WITH recent_orders AS (
SELECT customer_id, SUM(total) as total_spent
FROM orders
WHERE order_date > date('now', '-30 days')
GROUP BY customer_id
)
SELECT c.name, ro.total_spent
FROM customers c
JOIN recent_orders ro ON c.customer_id = ro.customer_id
WHERE ro.total_spent > 1000;
-- Subquery approach (optimized by query planner)
SELECT c.name, o.total_spent
FROM customers c
JOIN (
SELECT customer_id, SUM(total) as total_spent
FROM orders
WHERE order_date > date('now', '-30 days')
GROUP BY customer_id
) o ON c.customer_id = o.customer_id
WHERE o.total_spent > 1000;
Check the query plan:
EXPLAIN QUERY PLAN
WITH recent_orders AS (...)
SELECT ...;
Look for “MATERIALIZE” in the output. If the CTE is small or only referenced once, a subquery might perform better. If referenced multiple times or very large, materialization helps.
For performance-critical queries, test both approaches. CTEs shine when readability matters and performance is acceptable. For hot paths, optimize aggressively—sometimes that means sacrificing readability for speed.
When debugging complex CTEs, test each one independently:
WITH cte1 AS (...),
cte2 AS (...),
cte3 AS (...)
SELECT * FROM cte2; -- Test intermediate results
Build your query incrementally, verifying each CTE produces expected results before adding the next layer.
CTEs don’t support indexes, so ensure your underlying tables are properly indexed. The query planner can push predicates into CTE definitions, but complex CTEs may prevent optimization. Keep CTEs focused and let the main query handle final filtering when possible.