How to Use FILTER Clause in PostgreSQL

PostgreSQL 9.4 introduced the FILTER clause as a SQL standard feature that revolutionizes how we perform conditional aggregation. Before FILTER, developers had to resort to awkward CASE statements...

Key Insights

  • The FILTER clause enables conditional aggregation directly within aggregate functions, eliminating the need for verbose CASE statements or multiple subqueries while improving query readability.
  • FILTER works by applying a WHERE-like condition to individual aggregate functions rather than the entire result set, allowing different aggregates in the same query to operate on different subsets of data.
  • Using FILTER can significantly improve both performance and maintainability compared to traditional approaches, especially when calculating multiple conditional metrics in a single query.

Introduction to the FILTER Clause

PostgreSQL 9.4 introduced the FILTER clause as a SQL standard feature that revolutionizes how we perform conditional aggregation. Before FILTER, developers had to resort to awkward CASE statements nested inside aggregate functions or write multiple subqueries to calculate different metrics based on different conditions. FILTER provides a cleaner, more intuitive syntax that makes your intent explicit.

The key difference between FILTER and a WHERE clause is scope. A WHERE clause filters rows before any aggregation happens, affecting all aggregate functions in your query. FILTER, on the other hand, applies conditions to individual aggregate functions independently. This means you can calculate multiple aggregates with different conditions in a single pass through your data, without the performance penalty of multiple subqueries or the verbosity of CASE-based approaches.

Basic FILTER Syntax and Simple Examples

The FILTER clause follows a straightforward syntax pattern:

aggregate_function(column) FILTER (WHERE condition)

Let’s look at a practical example using an orders table. The traditional approach using CASE statements looks like this:

SELECT 
    COUNT(*) as total_orders,
    COUNT(CASE WHEN status = 'completed' THEN 1 END) as completed_orders,
    COUNT(CASE WHEN status = 'pending' THEN 1 END) as pending_orders,
    COUNT(CASE WHEN status = 'cancelled' THEN 1 END) as cancelled_orders
FROM orders;

With FILTER, this becomes much more readable:

SELECT 
    COUNT(*) as total_orders,
    COUNT(*) FILTER (WHERE status = 'completed') as completed_orders,
    COUNT(*) FILTER (WHERE status = 'pending') as pending_orders,
    COUNT(*) FILTER (WHERE status = 'cancelled') as cancelled_orders
FROM orders;

The FILTER version is immediately clearer. You can see at a glance that each aggregate is counting rows with specific status values. There’s no mental overhead of parsing through CASE/WHEN/THEN/END blocks.

Practical Use Cases with Multiple Aggregates

FILTER really shines when you need to calculate multiple different metrics with varying conditions. Consider a sales dashboard that needs to show diverse statistics across different dimensions:

SELECT 
    -- Revenue metrics by region
    SUM(amount) FILTER (WHERE region = 'North America') as na_revenue,
    SUM(amount) FILTER (WHERE region = 'Europe') as eu_revenue,
    SUM(amount) FILTER (WHERE region = 'Asia') as asia_revenue,
    
    -- Customer counts by tier
    COUNT(DISTINCT customer_id) FILTER (WHERE tier = 'premium') as premium_customers,
    COUNT(DISTINCT customer_id) FILTER (WHERE tier = 'standard') as standard_customers,
    
    -- Average order values by category
    AVG(amount) FILTER (WHERE category = 'electronics') as avg_electronics_order,
    AVG(amount) FILTER (WHERE category = 'clothing') as avg_clothing_order,
    
    -- Conversion metrics
    COUNT(*) FILTER (WHERE status = 'completed') as completed_count,
    COUNT(*) FILTER (WHERE status = 'completed' AND amount > 100) as high_value_completions
FROM sales
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days';

This single query replaces what would traditionally require multiple subqueries or complex CASE statements. The performance benefit is significant because PostgreSQL can calculate all these metrics in one table scan rather than multiple passes through the data.

Combining FILTER with GROUP BY

FILTER becomes even more powerful when combined with GROUP BY clauses, enabling pivot-style reports that would otherwise require complex query structures:

SELECT 
    DATE_TRUNC('month', order_date) as month,
    
    -- Sales channel breakdown
    SUM(amount) FILTER (WHERE channel = 'online') as online_sales,
    SUM(amount) FILTER (WHERE channel = 'in_store') as in_store_sales,
    SUM(amount) FILTER (WHERE channel = 'mobile_app') as mobile_sales,
    
    -- Order counts by channel
    COUNT(*) FILTER (WHERE channel = 'online') as online_orders,
    COUNT(*) FILTER (WHERE channel = 'in_store') as in_store_orders,
    
    -- Calculate channel-specific averages
    AVG(amount) FILTER (WHERE channel = 'online') as avg_online_order,
    AVG(amount) FILTER (WHERE channel = 'in_store') as avg_in_store_order,
    
    -- Total for comparison
    SUM(amount) as total_sales,
    COUNT(*) as total_orders
FROM sales
WHERE order_date >= '2024-01-01'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

This query produces a comprehensive monthly report showing sales broken down by channel, all in a single result set. Each row represents one month with multiple conditional aggregates calculated simultaneously.

Advanced Patterns and Window Functions

FILTER also works with window functions, opening up sophisticated analytical possibilities. You can create running totals that include only specific types of transactions, or calculate percentages where the numerator and denominator have different filtering criteria:

SELECT 
    order_date,
    customer_id,
    amount,
    status,
    
    -- Running total of completed orders only
    SUM(amount) FILTER (WHERE status = 'completed') 
        OVER (ORDER BY order_date) as running_completed_total,
    
    -- Count of high-value orders in the last 7 days
    COUNT(*) FILTER (WHERE amount > 500) 
        OVER (ORDER BY order_date 
              ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as recent_high_value_count,
    
    -- Percentage of total revenue (filtered numerator, full denominator)
    ROUND(100.0 * 
        SUM(amount) FILTER (WHERE category = 'electronics') OVER () / 
        SUM(amount) OVER (), 
    2) as electronics_pct_of_total
FROM sales
WHERE order_date >= CURRENT_DATE - INTERVAL '90 days'
ORDER BY order_date;

You can also nest conditions for complex business logic:

SELECT 
    region,
    COUNT(*) FILTER (
        WHERE status = 'completed' 
        AND amount > 1000 
        AND customer_type = 'new'
    ) as high_value_new_customers,
    
    AVG(amount) FILTER (
        WHERE status = 'completed' 
        AND (category = 'electronics' OR category = 'appliances')
        AND discount_applied = false
    ) as avg_full_price_tech_order
FROM sales
GROUP BY region;

Performance Considerations and Best Practices

FILTER typically outperforms equivalent CASE-based approaches because it makes your intent explicit to the query planner. However, there are still important considerations:

Indexing matters. If your FILTER conditions reference specific columns, ensure those columns are indexed appropriately. A query filtering on status repeatedly will benefit from an index on that column:

CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_sales_channel_date ON sales(channel, order_date);

Choose FILTER over subqueries when calculating multiple aggregates. A single query with multiple FILTER clauses will almost always outperform multiple subqueries:

-- Slower: Multiple subqueries
SELECT 
    (SELECT COUNT(*) FROM orders WHERE status = 'completed') as completed,
    (SELECT COUNT(*) FROM orders WHERE status = 'pending') as pending,
    (SELECT AVG(amount) FROM orders WHERE status = 'completed') as avg_completed;

-- Faster: Single query with FILTER
SELECT 
    COUNT(*) FILTER (WHERE status = 'completed') as completed,
    COUNT(*) FILTER (WHERE status = 'pending') as pending,
    AVG(amount) FILTER (WHERE status = 'completed') as avg_completed
FROM orders;

Readability is a feature. Even when performance is equivalent, FILTER makes your queries easier to understand and maintain. Future developers (including yourself) will thank you.

Common Pitfalls and Troubleshooting

NULL Handling: FILTER respects NULL values in aggregates. A COUNT with FILTER that matches no rows returns 0, but SUM returns NULL:

-- Incorrect: Might get NULL instead of 0
SELECT SUM(amount) FILTER (WHERE status = 'nonexistent_status') as total;

-- Correct: Use COALESCE for guaranteed numeric result
SELECT COALESCE(SUM(amount) FILTER (WHERE status = 'nonexistent_status'), 0) as total;

FILTER vs WHERE confusion: Remember that WHERE filters the entire result set before aggregation, while FILTER applies to individual aggregates:

-- Wrong: WHERE filters all rows, then counts what remains
SELECT COUNT(*) FILTER (WHERE status = 'completed')
FROM orders
WHERE status = 'pending';  -- This returns 0!

-- Right: No WHERE clause, let FILTER do the work
SELECT 
    COUNT(*) FILTER (WHERE status = 'completed') as completed,
    COUNT(*) FILTER (WHERE status = 'pending') as pending
FROM orders;

Version compatibility: FILTER requires PostgreSQL 9.4 or later. If you’re on an older version, you’ll need to use CASE statements. Check your version with:

SELECT version();

The FILTER clause is one of PostgreSQL’s most underutilized features. It transforms conditional aggregation from a verbose chore into an elegant, performant operation. Once you start using FILTER, you’ll wonder how you ever lived without it. Start incorporating it into your queries today, and watch your SQL become both more readable and more efficient.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.