SQL - Subquery in FROM Clause (Derived Table)

Key Insights

Derived tables are subqueries in the FROM clause that create temporary, inline result sets—they must always have an alias and exist only for the duration of the outer query.
Use derived tables to pre-aggregate data before joins, break complex logic into digestible steps, or filter on calculated values without resorting to HAVING clauses.
Derived tables and CTEs typically produce identical execution plans, so choose based on readability; use derived tables for simple, one-off transformations and CTEs when you need to reference the same result set multiple times.

Introduction to Derived Tables

When you write a SQL query, the FROM clause typically references physical tables or views. But SQL allows something more powerful: you can place an entire subquery in the FROM clause, creating what’s called a derived table (also known as an inline view or subquery table).

A derived table is a temporary result set that exists only during query execution. Unlike temporary tables or views, it has no persistent storage. The database engine executes the subquery, holds the results in memory, and makes them available to the outer query as if they were a regular table.

This technique transforms how you approach complex queries. Instead of cramming everything into a single SELECT statement with convoluted WHERE clauses and multiple GROUP BY expressions, you can build your query in logical layers. Each derived table handles one piece of the puzzle.

Basic Syntax and Structure

The syntax is straightforward but has strict requirements:

SELECT outer_columns
FROM (
    SELECT inner_columns
    FROM some_table
    WHERE some_condition
) AS alias_name
WHERE outer_condition;

Two rules are non-negotiable:

Parentheses are mandatory. The subquery must be enclosed in parentheses.
An alias is required. Every derived table needs a name. Without it, you’ll get a syntax error.

The alias creates a namespace for the derived table’s columns. You reference them as alias_name.column_name in the outer query.

Here’s a practical example that aggregates order data:

SELECT 
    order_summary.customer_id,
    order_summary.total_orders,
    order_summary.total_spent
FROM (
    SELECT 
        customer_id,
        COUNT(*) AS total_orders,
        SUM(order_total) AS total_spent
    FROM orders
    WHERE order_date >= '2024-01-01'
    GROUP BY customer_id
) AS order_summary
WHERE order_summary.total_spent > 1000;

The inner query aggregates orders by customer. The outer query filters to high-value customers. Notice how the derived table’s alias (order_summary) makes the outer query readable and self-documenting.

Scope matters here. Columns from the derived table are visible to the outer query, but the outer query’s columns aren’t visible inside the derived table. This one-way visibility is intentional—it keeps the subquery self-contained and predictable.

Common Use Cases

Derived tables solve several recurring problems elegantly.

Pre-Aggregating Data Before Joins

When you need to join aggregated data back to detail records, derived tables keep your logic clean:

SELECT 
    e.employee_id,
    e.employee_name,
    e.salary,
    dept_avg.avg_salary,
    e.salary - dept_avg.avg_salary AS variance_from_avg
FROM employees e
INNER JOIN (
    SELECT 
        department_id,
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
) AS dept_avg ON e.department_id = dept_avg.department_id
WHERE e.salary > dept_avg.avg_salary;

This query finds employees earning above their department’s average. The derived table calculates department averages, then the outer query joins back to individual employees. Trying to do this without a derived table would require correlated subqueries or window functions—both less intuitive for this use case.

Filtering on Aggregated Values

Sometimes you need to filter based on an aggregate, but the condition doesn’t fit naturally in a HAVING clause:

SELECT 
    product_stats.product_id,
    product_stats.product_name,
    product_stats.units_sold
FROM (
    SELECT 
        p.product_id,
        p.product_name,
        SUM(oi.quantity) AS units_sold
    FROM products p
    LEFT JOIN order_items oi ON p.product_id = oi.product_id
    GROUP BY p.product_id, p.product_name
) AS product_stats
WHERE product_stats.units_sold BETWEEN 100 AND 500
   OR product_stats.units_sold IS NULL;

Top-N Per Group

Finding the top records within each group is a classic derived table application:

SELECT 
    ranked_sales.region,
    ranked_sales.salesperson,
    ranked_sales.total_sales
FROM (
    SELECT 
        region,
        salesperson,
        total_sales,
        ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_sales DESC) AS rank
    FROM sales_summary
) AS ranked_sales
WHERE ranked_sales.rank <= 3;

The inner query assigns rankings. The outer query filters to the top 3 per region. Clean and efficient.

Derived Tables vs. CTEs

Common Table Expressions (CTEs) offer an alternative syntax for the same concept. Here’s the department average query rewritten as a CTE:

-- CTE version
WITH dept_avg AS (
    SELECT 
        department_id,
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT 
    e.employee_id,
    e.employee_name,
    e.salary,
    dept_avg.avg_salary
FROM employees e
INNER JOIN dept_avg ON e.department_id = dept_avg.department_id
WHERE e.salary > dept_avg.avg_salary;

-- Derived table version (same logic)
SELECT 
    e.employee_id,
    e.employee_name,
    e.salary,
    dept_avg.avg_salary
FROM employees e
INNER JOIN (
    SELECT 
        department_id,
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
) AS dept_avg ON e.department_id = dept_avg.department_id
WHERE e.salary > dept_avg.avg_salary;

When to choose CTEs:

You need to reference the same result set multiple times
The query has many layers of nesting (CTEs flatten the structure)
You want to name intermediate steps for documentation purposes

When to choose derived tables:

Simple, one-off transformations
You prefer seeing the subquery inline with where it’s used
You’re working with older database systems that don’t support CTEs

Performance is typically identical. Modern query optimizers treat both constructs the same way—they analyze the logical intent and generate equivalent execution plans. Don’t choose based on perceived performance differences; choose based on readability.

Performance Considerations

Understanding how databases execute derived tables helps you write efficient queries.

Merging vs. Materialization

The optimizer has two strategies:

Merging: The optimizer “flattens” the derived table, incorporating its logic directly into the outer query. This allows better optimization across the entire query.
Materialization: The optimizer executes the derived table first, stores results in a temporary structure, then processes the outer query against that structure.

Most optimizers prefer merging when possible. Materialization happens when merging isn’t feasible—typically with DISTINCT, GROUP BY, LIMIT, or window functions in the derived table.

You can examine the optimizer’s choice using EXPLAIN:

EXPLAIN ANALYZE
SELECT 
    customer_summary.customer_id,
    customer_summary.order_count
FROM (
    SELECT 
        customer_id,
        COUNT(*) AS order_count
    FROM orders
    GROUP BY customer_id
) AS customer_summary
WHERE customer_summary.order_count > 10;

Look for terms like “Materialize” or “Subquery Scan” in the output to understand how your derived table is being processed.

Index Usage

Derived tables don’t have indexes. If the outer query needs to filter or join on derived table columns, the database must scan the entire result set. Keep derived tables as small as possible by:

Filtering aggressively in the inner query
Selecting only needed columns
Applying LIMIT when appropriate

Practical Example: Multi-Level Aggregation

Real-world reporting often requires aggregating aggregates. Consider this scenario: you need the average monthly sales per region, but your data is at the transaction level.

SELECT 
    regional_monthly.region,
    ROUND(AVG(regional_monthly.monthly_total), 2) AS avg_monthly_sales,
    MIN(regional_monthly.monthly_total) AS lowest_month,
    MAX(regional_monthly.monthly_total) AS highest_month
FROM (
    SELECT 
        s.region,
        DATE_TRUNC('month', s.sale_date) AS sale_month,
        SUM(s.amount) AS monthly_total
    FROM sales s
    WHERE s.sale_date >= '2024-01-01'
      AND s.sale_date < '2025-01-01'
    GROUP BY s.region, DATE_TRUNC('month', s.sale_date)
) AS regional_monthly
GROUP BY regional_monthly.region
ORDER BY avg_monthly_sales DESC;

The inner derived table aggregates daily sales into monthly totals by region. The outer query then calculates statistics across those monthly figures. Without this two-stage approach, you’d be stuck trying to express “average of sums” in a single GROUP BY—which isn’t possible.

For even more complex scenarios, you can nest derived tables:

SELECT 
    yearly_avg.region,
    yearly_avg.year,
    yearly_avg.avg_monthly_sales
FROM (
    SELECT 
        regional_monthly.region,
        EXTRACT(YEAR FROM regional_monthly.sale_month) AS year,
        AVG(regional_monthly.monthly_total) AS avg_monthly_sales
    FROM (
        SELECT 
            region,
            DATE_TRUNC('month', sale_date) AS sale_month,
            SUM(amount) AS monthly_total
        FROM sales
        GROUP BY region, DATE_TRUNC('month', sale_date)
    ) AS regional_monthly
    GROUP BY regional_monthly.region, EXTRACT(YEAR FROM regional_monthly.sale_month)
) AS yearly_avg
WHERE yearly_avg.avg_monthly_sales > 50000;

This calculates average monthly sales per region per year, then filters to high-performing combinations. At this nesting level, consider switching to CTEs for readability.

Summary and Best Practices

Derived tables are a fundamental SQL technique that every developer should master. They let you decompose complex queries into manageable pieces, pre-aggregate data for joins, and express multi-level calculations cleanly.

Remember these guidelines:

Always provide an alias—it’s not optional
Keep derived tables focused on a single transformation
Filter early and select only needed columns
Switch to CTEs when nesting exceeds two levels or when you need to reuse results
Use EXPLAIN to verify the optimizer handles your derived table efficiently

Derived tables won’t solve every problem. For truly complex transformations, temporary tables give you indexes and statistics. For reusable logic, views provide persistence. But for inline, query-specific transformations, derived tables hit the sweet spot between power and simplicity.