How to Use CTEs in SQLite

Common Table Expressions (CTEs) are named temporary result sets that exist only for the duration of a single query. You define them using the `WITH` clause before your main query, and they act as...

Key Insights

  • CTEs transform complex queries into readable, modular steps that you can build and test incrementally, making SQL maintenance significantly easier than nested subqueries.
  • SQLite materializes CTEs once per query execution, which improves performance for queries that reference the same subquery multiple times but may add overhead for simple cases.
  • Recursive CTEs unlock powerful hierarchical and graph traversal capabilities that would otherwise require application-level code or complex self-joins.

What Are CTEs and Why Use Them?

Common Table Expressions (CTEs) are named temporary result sets that exist only for the duration of a single query. You define them using the WITH clause before your main query, and they act as virtual tables you can reference throughout your SQL statement.

CTEs solve three primary problems. First, they make complex queries readable by breaking logic into named, sequential steps. Second, they enable recursive queries for hierarchical data without procedural code. Third, they eliminate redundancy when you need to reference the same subquery multiple times.

Consider this query without a CTE:

SELECT p.product_name, p.price
FROM products p
WHERE p.price > (
    SELECT AVG(price) FROM products WHERE category_id = p.category_id
)
AND p.category_id IN (
    SELECT category_id FROM products WHERE category_id = p.category_id GROUP BY category_id HAVING COUNT(*) > 5
);

Now with a CTE:

WITH category_averages AS (
    SELECT category_id, AVG(price) as avg_price
    FROM products
    GROUP BY category_id
    HAVING COUNT(*) > 5
)
SELECT p.product_name, p.price
FROM products p
JOIN category_averages ca ON p.category_id = ca.category_id
WHERE p.price > ca.avg_price;

The CTE version is clearer, testable in isolation, and SQLite only calculates the category averages once.

Use CTEs when you need readable intermediate steps, recursive operations, or when the same subquery appears multiple times. Stick with subqueries for simple one-off filters. Avoid temporary tables unless you need to index intermediate results or use them across multiple queries.

Basic CTE Syntax and Simple Examples

The basic CTE syntax starts with WITH, followed by the CTE name, an optional column list, and the query definition:

WITH cte_name (column1, column2) AS (
    SELECT ...
)
SELECT * FROM cte_name;

Here’s a practical example finding employees earning above their department’s average:

WITH department_averages AS (
    SELECT 
        department_id,
        AVG(salary) as avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT 
    e.employee_name,
    e.salary,
    e.department_id,
    da.avg_salary,
    ROUND(((e.salary - da.avg_salary) / da.avg_salary) * 100, 2) as percent_above_avg
FROM employees e
JOIN department_averages da ON e.department_id = da.department_id
WHERE e.salary > da.avg_salary
ORDER BY percent_above_avg DESC;

This query calculates department averages once, then joins them back to find high earners. The department_averages CTE acts like a table, but SQLite discards it after query execution.

You can reference CTEs in SELECT, INSERT, UPDATE, or DELETE statements. They’re scoped to the statement where they’re defined—you cannot use them in subsequent queries.

Multiple CTEs and Chaining

Define multiple CTEs by separating them with commas after the initial WITH keyword. Later CTEs can reference earlier ones, letting you build complex logic incrementally:

WITH 
monthly_sales AS (
    SELECT 
        strftime('%Y-%m', order_date) as month,
        SUM(total_amount) as monthly_total
    FROM orders
    WHERE order_date >= date('now', '-12 months')
    GROUP BY strftime('%Y-%m', order_date)
),
yearly_average AS (
    SELECT AVG(monthly_total) as avg_monthly_sales
    FROM monthly_sales
),
performance_analysis AS (
    SELECT 
        ms.month,
        ms.monthly_total,
        ya.avg_monthly_sales,
        ms.monthly_total - ya.avg_monthly_sales as variance,
        CASE 
            WHEN ms.monthly_total > ya.avg_monthly_sales * 1.1 THEN 'Excellent'
            WHEN ms.monthly_total > ya.avg_monthly_sales THEN 'Good'
            ELSE 'Below Average'
        END as performance
    FROM monthly_sales ms
    CROSS JOIN yearly_average ya
)
SELECT * FROM performance_analysis
ORDER BY month DESC;

This query chains three CTEs: monthly_sales aggregates raw data, yearly_average summarizes those results, and performance_analysis combines both to categorize each month’s performance. Each step is independently testable—comment out the final SELECT and query any intermediate CTE directly.

This approach transforms what would be an unreadable nested query into a clear data pipeline. You’re essentially writing SQL like procedural code, with each CTE representing a transformation step.

Recursive CTEs

Recursive CTEs are where SQLite’s CTE implementation becomes genuinely powerful. They consist of two parts joined by UNION ALL: an anchor member (the base case) and a recursive member (the iterative step).

The syntax requires the RECURSIVE keyword:

WITH RECURSIVE cte_name AS (
    -- Anchor member
    SELECT ...
    UNION ALL
    -- Recursive member
    SELECT ... FROM cte_name WHERE ...
)
SELECT * FROM cte_name;

Here’s a practical example traversing an organizational hierarchy:

WITH RECURSIVE employee_hierarchy AS (
    -- Anchor: Start with the CEO
    SELECT 
        employee_id,
        employee_name,
        manager_id,
        1 as level,
        employee_name as path
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- Recursive: Find direct reports
    SELECT 
        e.employee_id,
        e.employee_name,
        e.manager_id,
        eh.level + 1,
        eh.path || ' > ' || e.employee_name
    FROM employees e
    JOIN employee_hierarchy eh ON e.manager_id = eh.employee_id
    WHERE eh.level < 10  -- Prevent infinite loops
)
SELECT 
    level,
    employee_name,
    path
FROM employee_hierarchy
ORDER BY level, employee_name;

The anchor member selects top-level employees (where manager_id IS NULL). The recursive member joins back to the CTE itself, finding each employee’s direct reports. SQLite executes this iteratively until no new rows are found.

Always include a termination condition (like level < 10) to prevent infinite loops. Recursive CTEs will run until they stop producing new rows or hit SQLite’s recursion limit.

Another common use case is generating sequences:

WITH RECURSIVE date_series AS (
    SELECT date('now', '-30 days') as date
    UNION ALL
    SELECT date(date, '+1 day')
    FROM date_series
    WHERE date < date('now')
)
SELECT date FROM date_series;

This generates the last 30 days, useful for filling gaps in time-series data.

Practical Use Cases

CTEs excel at data transformation pipelines where you need multiple calculation steps. Here’s a product category tree with depth indicators:

WITH RECURSIVE category_tree AS (
    -- Root categories
    SELECT 
        category_id,
        category_name,
        parent_category_id,
        0 as depth,
        category_name as full_path,
        CAST(category_id AS TEXT) as sort_path
    FROM categories
    WHERE parent_category_id IS NULL
    
    UNION ALL
    
    -- Child categories
    SELECT 
        c.category_id,
        c.category_name,
        c.parent_category_id,
        ct.depth + 1,
        ct.full_path || ' / ' || c.category_name,
        ct.sort_path || '-' || CAST(c.category_id AS TEXT)
    FROM categories c
    JOIN category_tree ct ON c.parent_category_id = ct.category_id
),
category_stats AS (
    SELECT 
        category_id,
        COUNT(*) as product_count,
        AVG(price) as avg_price
    FROM products
    GROUP BY category_id
)
SELECT 
    ct.depth,
    SUBSTR('                    ', 1, ct.depth * 2) || ct.category_name as indented_name,
    ct.full_path,
    COALESCE(cs.product_count, 0) as products,
    COALESCE(cs.avg_price, 0) as avg_price
FROM category_tree ct
LEFT JOIN category_stats cs ON ct.category_id = cs.category_id
ORDER BY ct.sort_path;

This query builds a complete category hierarchy with indentation, full paths, and product statistics. The sort_path ensures proper tree ordering, and the LEFT JOIN handles categories without products.

Use this pattern for org charts, comment threads, file systems, or any hierarchical structure. The recursive CTE handles arbitrary depth without knowing the tree structure in advance.

Performance Considerations and Best Practices

SQLite materializes CTEs once per query execution, storing results in a temporary table. This helps when you reference the same CTE multiple times but adds overhead for simple cases where a subquery would suffice.

Compare these approaches:

-- CTE approach (materialized once)
WITH recent_orders AS (
    SELECT customer_id, SUM(total) as total_spent
    FROM orders
    WHERE order_date > date('now', '-30 days')
    GROUP BY customer_id
)
SELECT c.name, ro.total_spent
FROM customers c
JOIN recent_orders ro ON c.customer_id = ro.customer_id
WHERE ro.total_spent > 1000;

-- Subquery approach (optimized by query planner)
SELECT c.name, o.total_spent
FROM customers c
JOIN (
    SELECT customer_id, SUM(total) as total_spent
    FROM orders
    WHERE order_date > date('now', '-30 days')
    GROUP BY customer_id
) o ON c.customer_id = o.customer_id
WHERE o.total_spent > 1000;

Check the query plan:

EXPLAIN QUERY PLAN
WITH recent_orders AS (...)
SELECT ...;

Look for “MATERIALIZE” in the output. If the CTE is small or only referenced once, a subquery might perform better. If referenced multiple times or very large, materialization helps.

For performance-critical queries, test both approaches. CTEs shine when readability matters and performance is acceptable. For hot paths, optimize aggressively—sometimes that means sacrificing readability for speed.

When debugging complex CTEs, test each one independently:

WITH cte1 AS (...),
     cte2 AS (...),
     cte3 AS (...)
SELECT * FROM cte2;  -- Test intermediate results

Build your query incrementally, verifying each CTE produces expected results before adding the next layer.

CTEs don’t support indexes, so ensure your underlying tables are properly indexed. The query planner can push predicates into CTE definitions, but complex CTEs may prevent optimization. Keep CTEs focused and let the main query handle final filtering when possible.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.