How to Use FILTER Clause in PostgreSQL
PostgreSQL 9.4 introduced the FILTER clause as a SQL standard feature that revolutionizes how we perform conditional aggregation. Before FILTER, developers had to resort to awkward CASE statements...
Key Insights
- The FILTER clause enables conditional aggregation directly within aggregate functions, eliminating the need for verbose CASE statements or multiple subqueries while improving query readability.
- FILTER works by applying a WHERE-like condition to individual aggregate functions rather than the entire result set, allowing different aggregates in the same query to operate on different subsets of data.
- Using FILTER can significantly improve both performance and maintainability compared to traditional approaches, especially when calculating multiple conditional metrics in a single query.
Introduction to the FILTER Clause
PostgreSQL 9.4 introduced the FILTER clause as a SQL standard feature that revolutionizes how we perform conditional aggregation. Before FILTER, developers had to resort to awkward CASE statements nested inside aggregate functions or write multiple subqueries to calculate different metrics based on different conditions. FILTER provides a cleaner, more intuitive syntax that makes your intent explicit.
The key difference between FILTER and a WHERE clause is scope. A WHERE clause filters rows before any aggregation happens, affecting all aggregate functions in your query. FILTER, on the other hand, applies conditions to individual aggregate functions independently. This means you can calculate multiple aggregates with different conditions in a single pass through your data, without the performance penalty of multiple subqueries or the verbosity of CASE-based approaches.
Basic FILTER Syntax and Simple Examples
The FILTER clause follows a straightforward syntax pattern:
aggregate_function(column) FILTER (WHERE condition)
Let’s look at a practical example using an orders table. The traditional approach using CASE statements looks like this:
SELECT
COUNT(*) as total_orders,
COUNT(CASE WHEN status = 'completed' THEN 1 END) as completed_orders,
COUNT(CASE WHEN status = 'pending' THEN 1 END) as pending_orders,
COUNT(CASE WHEN status = 'cancelled' THEN 1 END) as cancelled_orders
FROM orders;
With FILTER, this becomes much more readable:
SELECT
COUNT(*) as total_orders,
COUNT(*) FILTER (WHERE status = 'completed') as completed_orders,
COUNT(*) FILTER (WHERE status = 'pending') as pending_orders,
COUNT(*) FILTER (WHERE status = 'cancelled') as cancelled_orders
FROM orders;
The FILTER version is immediately clearer. You can see at a glance that each aggregate is counting rows with specific status values. There’s no mental overhead of parsing through CASE/WHEN/THEN/END blocks.
Practical Use Cases with Multiple Aggregates
FILTER really shines when you need to calculate multiple different metrics with varying conditions. Consider a sales dashboard that needs to show diverse statistics across different dimensions:
SELECT
-- Revenue metrics by region
SUM(amount) FILTER (WHERE region = 'North America') as na_revenue,
SUM(amount) FILTER (WHERE region = 'Europe') as eu_revenue,
SUM(amount) FILTER (WHERE region = 'Asia') as asia_revenue,
-- Customer counts by tier
COUNT(DISTINCT customer_id) FILTER (WHERE tier = 'premium') as premium_customers,
COUNT(DISTINCT customer_id) FILTER (WHERE tier = 'standard') as standard_customers,
-- Average order values by category
AVG(amount) FILTER (WHERE category = 'electronics') as avg_electronics_order,
AVG(amount) FILTER (WHERE category = 'clothing') as avg_clothing_order,
-- Conversion metrics
COUNT(*) FILTER (WHERE status = 'completed') as completed_count,
COUNT(*) FILTER (WHERE status = 'completed' AND amount > 100) as high_value_completions
FROM sales
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days';
This single query replaces what would traditionally require multiple subqueries or complex CASE statements. The performance benefit is significant because PostgreSQL can calculate all these metrics in one table scan rather than multiple passes through the data.
Combining FILTER with GROUP BY
FILTER becomes even more powerful when combined with GROUP BY clauses, enabling pivot-style reports that would otherwise require complex query structures:
SELECT
DATE_TRUNC('month', order_date) as month,
-- Sales channel breakdown
SUM(amount) FILTER (WHERE channel = 'online') as online_sales,
SUM(amount) FILTER (WHERE channel = 'in_store') as in_store_sales,
SUM(amount) FILTER (WHERE channel = 'mobile_app') as mobile_sales,
-- Order counts by channel
COUNT(*) FILTER (WHERE channel = 'online') as online_orders,
COUNT(*) FILTER (WHERE channel = 'in_store') as in_store_orders,
-- Calculate channel-specific averages
AVG(amount) FILTER (WHERE channel = 'online') as avg_online_order,
AVG(amount) FILTER (WHERE channel = 'in_store') as avg_in_store_order,
-- Total for comparison
SUM(amount) as total_sales,
COUNT(*) as total_orders
FROM sales
WHERE order_date >= '2024-01-01'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;
This query produces a comprehensive monthly report showing sales broken down by channel, all in a single result set. Each row represents one month with multiple conditional aggregates calculated simultaneously.
Advanced Patterns and Window Functions
FILTER also works with window functions, opening up sophisticated analytical possibilities. You can create running totals that include only specific types of transactions, or calculate percentages where the numerator and denominator have different filtering criteria:
SELECT
order_date,
customer_id,
amount,
status,
-- Running total of completed orders only
SUM(amount) FILTER (WHERE status = 'completed')
OVER (ORDER BY order_date) as running_completed_total,
-- Count of high-value orders in the last 7 days
COUNT(*) FILTER (WHERE amount > 500)
OVER (ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as recent_high_value_count,
-- Percentage of total revenue (filtered numerator, full denominator)
ROUND(100.0 *
SUM(amount) FILTER (WHERE category = 'electronics') OVER () /
SUM(amount) OVER (),
2) as electronics_pct_of_total
FROM sales
WHERE order_date >= CURRENT_DATE - INTERVAL '90 days'
ORDER BY order_date;
You can also nest conditions for complex business logic:
SELECT
region,
COUNT(*) FILTER (
WHERE status = 'completed'
AND amount > 1000
AND customer_type = 'new'
) as high_value_new_customers,
AVG(amount) FILTER (
WHERE status = 'completed'
AND (category = 'electronics' OR category = 'appliances')
AND discount_applied = false
) as avg_full_price_tech_order
FROM sales
GROUP BY region;
Performance Considerations and Best Practices
FILTER typically outperforms equivalent CASE-based approaches because it makes your intent explicit to the query planner. However, there are still important considerations:
Indexing matters. If your FILTER conditions reference specific columns, ensure those columns are indexed appropriately. A query filtering on status repeatedly will benefit from an index on that column:
CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_sales_channel_date ON sales(channel, order_date);
Choose FILTER over subqueries when calculating multiple aggregates. A single query with multiple FILTER clauses will almost always outperform multiple subqueries:
-- Slower: Multiple subqueries
SELECT
(SELECT COUNT(*) FROM orders WHERE status = 'completed') as completed,
(SELECT COUNT(*) FROM orders WHERE status = 'pending') as pending,
(SELECT AVG(amount) FROM orders WHERE status = 'completed') as avg_completed;
-- Faster: Single query with FILTER
SELECT
COUNT(*) FILTER (WHERE status = 'completed') as completed,
COUNT(*) FILTER (WHERE status = 'pending') as pending,
AVG(amount) FILTER (WHERE status = 'completed') as avg_completed
FROM orders;
Readability is a feature. Even when performance is equivalent, FILTER makes your queries easier to understand and maintain. Future developers (including yourself) will thank you.
Common Pitfalls and Troubleshooting
NULL Handling: FILTER respects NULL values in aggregates. A COUNT with FILTER that matches no rows returns 0, but SUM returns NULL:
-- Incorrect: Might get NULL instead of 0
SELECT SUM(amount) FILTER (WHERE status = 'nonexistent_status') as total;
-- Correct: Use COALESCE for guaranteed numeric result
SELECT COALESCE(SUM(amount) FILTER (WHERE status = 'nonexistent_status'), 0) as total;
FILTER vs WHERE confusion: Remember that WHERE filters the entire result set before aggregation, while FILTER applies to individual aggregates:
-- Wrong: WHERE filters all rows, then counts what remains
SELECT COUNT(*) FILTER (WHERE status = 'completed')
FROM orders
WHERE status = 'pending'; -- This returns 0!
-- Right: No WHERE clause, let FILTER do the work
SELECT
COUNT(*) FILTER (WHERE status = 'completed') as completed,
COUNT(*) FILTER (WHERE status = 'pending') as pending
FROM orders;
Version compatibility: FILTER requires PostgreSQL 9.4 or later. If you’re on an older version, you’ll need to use CASE statements. Check your version with:
SELECT version();
The FILTER clause is one of PostgreSQL’s most underutilized features. It transforms conditional aggregation from a verbose chore into an elegant, performant operation. Once you start using FILTER, you’ll wonder how you ever lived without it. Start incorporating it into your queries today, and watch your SQL become both more readable and more efficient.