How to Use HAVING in SQLite

The HAVING clause is SQLite's mechanism for filtering grouped data after aggregation. This is fundamentally different from WHERE, which filters individual rows before any grouping occurs....

Key Insights

  • HAVING filters aggregated data after GROUP BY executes, while WHERE filters individual rows before grouping—using the wrong one produces incorrect results or errors
  • Every column in HAVING must either be part of an aggregate function (COUNT, SUM, AVG, etc.) or appear in the GROUP BY clause
  • Combining WHERE and HAVING optimally improves performance: use WHERE to reduce rows before expensive grouping operations, then HAVING to filter the aggregated results

Understanding HAVING vs WHERE

The HAVING clause is SQLite’s mechanism for filtering grouped data after aggregation. This is fundamentally different from WHERE, which filters individual rows before any grouping occurs. Understanding this distinction is critical—using WHERE when you need HAVING will produce errors, while using HAVING when WHERE would suffice wastes computational resources.

Consider this comparison:

-- WHERE filters rows BEFORE grouping
SELECT city, COUNT(*) as customer_count
FROM customers
WHERE registration_date > '2024-01-01'
GROUP BY city;

-- HAVING filters groups AFTER aggregation
SELECT city, COUNT(*) as customer_count
FROM customers
GROUP BY city
HAVING COUNT(*) > 5;

-- This produces an error - WHERE cannot use aggregate functions
SELECT city, COUNT(*) as customer_count
FROM customers
WHERE COUNT(*) > 5  -- ERROR!
GROUP BY city;

The first query filters customers by registration date, then groups them. The second query groups all customers, then filters to show only cities with more than 5 customers. The third query fails because WHERE evaluates before grouping—there’s no aggregate value to compare yet.

Basic HAVING Syntax and Structure

The HAVING clause has a specific position in the SELECT statement order:

SELECT columns
FROM table
WHERE row_conditions
GROUP BY grouping_columns
HAVING group_conditions
ORDER BY sort_columns
LIMIT row_count;

HAVING always comes after GROUP BY and before ORDER BY. Here’s a practical example:

-- Find cities with significant customer bases
SELECT 
    city,
    COUNT(*) as total_customers,
    COUNT(DISTINCT company) as unique_companies
FROM customers
GROUP BY city
HAVING COUNT(*) > 5
ORDER BY total_customers DESC;

This query groups customers by city, counts them, then filters out cities with 5 or fewer customers. The HAVING clause ensures we only see cities meeting our threshold.

HAVING with Aggregate Functions

HAVING truly shines when working with aggregate functions. Each function serves different analytical purposes:

-- COUNT: Find products with high order frequency
SELECT 
    product_id,
    product_name,
    COUNT(*) as order_count
FROM orders
JOIN order_items USING (order_id)
JOIN products USING (product_id)
GROUP BY product_id, product_name
HAVING COUNT(*) >= 100;

-- SUM: Identify high-revenue products
SELECT 
    product_id,
    product_name,
    SUM(quantity * unit_price) as total_revenue
FROM order_items
JOIN products USING (product_id)
GROUP BY product_id, product_name
HAVING SUM(quantity * unit_price) > 10000
ORDER BY total_revenue DESC;

-- AVG: Find departments with above-average compensation
SELECT 
    department,
    COUNT(*) as employee_count,
    AVG(salary) as avg_salary,
    MIN(salary) as min_salary,
    MAX(salary) as max_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 75000;

Each example demonstrates a different business question answered through aggregation and filtering. The SUM example is particularly useful for sales analysis, while the AVG example helps identify compensation patterns across organizational units.

Combining WHERE and HAVING

The real power emerges when you combine WHERE and HAVING strategically. Use WHERE to reduce the dataset before the expensive grouping operation, then HAVING to filter the aggregated results:

-- Analyze 2024 customer purchasing patterns
SELECT 
    customer_id,
    customer_name,
    COUNT(*) as order_count,
    SUM(order_total) as total_spent,
    AVG(order_total) as avg_order_value
FROM customers
JOIN orders USING (customer_id)
WHERE 
    order_date >= '2024-01-01' 
    AND order_date < '2025-01-01'
    AND order_status = 'completed'
GROUP BY customer_id, customer_name
HAVING 
    SUM(order_total) > 500
    AND COUNT(*) >= 3
ORDER BY total_spent DESC;

This query first filters to completed orders from 2024 (WHERE), then groups by customer, and finally shows only customers who spent over $500 across at least 3 orders (HAVING). The WHERE clause reduces the dataset significantly before grouping, improving performance.

Here’s another practical example for inventory management:

-- Find slow-moving products that need attention
SELECT 
    p.category,
    p.product_name,
    COUNT(oi.order_id) as times_ordered,
    SUM(oi.quantity) as total_units_sold,
    p.stock_quantity
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
WHERE 
    p.discontinued = 0
    AND p.stock_quantity > 0
GROUP BY p.product_id, p.category, p.product_name, p.stock_quantity
HAVING 
    COUNT(oi.order_id) < 5
    OR COUNT(oi.order_id) IS NULL
ORDER BY p.stock_quantity DESC;

Complex HAVING Conditions

HAVING supports complex logical conditions using AND, OR, and NOT operators, plus expressions and calculations:

-- Multi-condition analysis: profitable, high-volume customers
SELECT 
    customer_id,
    customer_name,
    COUNT(*) as order_count,
    SUM(order_total) as revenue,
    SUM(order_total - cost) as profit,
    AVG(order_total) as avg_order
FROM customers
JOIN orders USING (customer_id)
GROUP BY customer_id, customer_name
HAVING 
    COUNT(*) > 10
    AND SUM(order_total - cost) > 1000
    AND AVG(order_total) > 50;

-- Using expressions in HAVING
SELECT 
    product_category,
    COUNT(*) as product_count,
    SUM(units_sold) as total_units,
    SUM(units_sold * price) as revenue
FROM products
GROUP BY product_category
HAVING 
    SUM(units_sold * price) / COUNT(*) > 5000  -- Avg revenue per product
    OR SUM(units_sold) > 10000;

-- HAVING with subqueries for comparative analysis
SELECT 
    department,
    AVG(salary) as dept_avg_salary,
    COUNT(*) as employee_count
FROM employees
GROUP BY department
HAVING AVG(salary) > (
    SELECT AVG(salary) * 1.1 
    FROM employees
)
ORDER BY dept_avg_salary DESC;

The subquery example identifies departments where average salary exceeds the company average by 10% or more. This pattern is invaluable for comparative analysis across groups.

Common Pitfalls and Best Practices

The most frequent mistake is referencing non-aggregated, non-grouped columns in HAVING:

-- INCORRECT: status is not aggregated or grouped
SELECT 
    customer_id,
    COUNT(*) as order_count
FROM orders
GROUP BY customer_id
HAVING status = 'completed';  -- ERROR!

-- CORRECT: Move row-level filter to WHERE
SELECT 
    customer_id,
    COUNT(*) as order_count
FROM orders
WHERE status = 'completed'
GROUP BY customer_id;

-- CORRECT: If you need to filter on aggregated status counts
SELECT 
    customer_id,
    COUNT(*) as order_count,
    SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) as completed_count
FROM orders
GROUP BY customer_id
HAVING SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) > 5;

Performance considerations matter. Always filter as early as possible:

-- INEFFICIENT: Processes all historical data before filtering
SELECT 
    product_id,
    SUM(quantity) as total_sold
FROM order_items
GROUP BY product_id
HAVING product_id IN (SELECT product_id FROM active_products);

-- EFFICIENT: Filters to active products first
SELECT 
    oi.product_id,
    SUM(oi.quantity) as total_sold
FROM order_items oi
WHERE oi.product_id IN (SELECT product_id FROM active_products)
GROUP BY oi.product_id;

Another best practice: use meaningful aliases for aggregated columns to improve query readability:

SELECT 
    category,
    COUNT(*) as product_count,
    AVG(price) as avg_price,
    SUM(inventory_value) as total_inventory_value
FROM products
GROUP BY category
HAVING 
    product_count > 10  -- Clear and readable
    AND avg_price < 100
    AND total_inventory_value > 50000;

When debugging HAVING clauses, temporarily remove the HAVING condition and examine the grouped results. This helps verify your aggregations are correct before applying filters.

The HAVING clause is essential for analytical queries in SQLite. Master the distinction between WHERE and HAVING, understand proper column references in grouped queries, and combine both clauses strategically for optimal performance. Your analytical queries will be both correct and efficient.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.