How to Use HAVING in MySQL

The HAVING clause in MySQL filters grouped data after aggregation occurs. While WHERE filters individual rows before they're grouped, HAVING operates on the results of GROUP BY operations. This...

Key Insights

  • HAVING filters aggregated data after GROUP BY executes, while WHERE filters individual rows before grouping—understanding this execution order is critical for writing correct queries
  • You can only reference aggregate functions (COUNT, SUM, AVG, etc.) or grouped columns in HAVING clauses, making it essential for filtering statistical results like “customers with more than 10 orders”
  • Combining WHERE and HAVING in the same query often produces the best performance: use WHERE to reduce rows early, then HAVING to filter the aggregated results

Understanding the HAVING Clause

The HAVING clause in MySQL filters grouped data after aggregation occurs. While WHERE filters individual rows before they’re grouped, HAVING operates on the results of GROUP BY operations. This distinction isn’t just semantic—it fundamentally changes what you can filter and when that filtering happens in query execution.

Think of it this way: WHERE answers “which individual rows should I include?”, while HAVING answers “which groups of rows meet my criteria?” This makes HAVING indispensable when working with aggregate functions like COUNT(), SUM(), AVG(), MAX(), and MIN().

Here’s the simplest comparison:

-- WHERE filters individual rows
SELECT customer_id, order_date, amount
FROM orders
WHERE amount > 100;

-- HAVING filters grouped results
SELECT customer_id, COUNT(*) as order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;

The first query returns individual orders over $100. The second query returns only customers who have placed more than 5 orders—a fundamentally different question that requires grouping and counting first.

Basic HAVING Syntax with GROUP BY

HAVING almost always appears with GROUP BY because you need aggregated data to filter. The standard structure follows this pattern:

SELECT column, AGGREGATE_FUNCTION(column)
FROM table
GROUP BY column
HAVING AGGREGATE_FUNCTION(column) condition;

Let’s look at a practical example using a customers and orders scenario:

SELECT 
    customer_id,
    COUNT(*) as total_purchases,
    SUM(order_amount) as total_spent
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;

This query identifies customers who have made more than 5 purchases. The GROUP BY collects all orders per customer, COUNT(*) tallies them, and HAVING filters to only those groups where the count exceeds 5.

You can filter on any aggregate function:

SELECT 
    product_category,
    AVG(price) as average_price,
    COUNT(*) as product_count
FROM products
GROUP BY product_category
HAVING AVG(price) > 50 AND COUNT(*) >= 10;

This finds categories with an average product price over $50 that contain at least 10 products.

HAVING with Multiple Conditions

Real-world queries often require multiple filtering criteria on aggregated data. You can combine conditions using AND and OR operators, just like in WHERE clauses:

SELECT 
    product_category,
    SUM(sales_amount) as total_sales,
    AVG(unit_price) as avg_price,
    COUNT(DISTINCT product_id) as product_count
FROM sales
JOIN products ON sales.product_id = products.id
GROUP BY product_category
HAVING SUM(sales_amount) > 10000 
    AND AVG(unit_price) < 50
    AND COUNT(DISTINCT product_id) >= 5;

This query finds product categories that meet three criteria simultaneously:

  • Total sales exceeding $10,000
  • Average unit price below $50
  • At least 5 different products in the category

You can also use OR for alternative conditions:

SELECT 
    sales_rep_id,
    COUNT(*) as deals_closed,
    SUM(deal_value) as total_value
FROM deals
WHERE deal_status = 'closed'
GROUP BY sales_rep_id
HAVING COUNT(*) > 20 OR SUM(deal_value) > 100000;

This identifies high-performing sales reps who either closed more than 20 deals OR generated over $100,000 in total value.

HAVING vs WHERE: When to Use Each

Understanding the execution order is crucial for writing correct and efficient queries. MySQL processes queries in this sequence:

  1. FROM/JOIN - Determines which tables to use
  2. WHERE - Filters individual rows
  3. GROUP BY - Aggregates rows into groups
  4. HAVING - Filters groups
  5. SELECT - Determines which columns to return
  6. ORDER BY - Sorts results

Because WHERE executes before GROUP BY, you cannot use aggregate functions in WHERE clauses. Because HAVING executes after GROUP BY, you can only reference grouped columns or aggregate functions.

Here’s a query demonstrating both clauses working together:

SELECT 
    customer_id,
    COUNT(*) as order_count,
    SUM(order_amount) as total_spent
FROM orders
WHERE order_date >= '2024-01-01' 
    AND order_status = 'completed'
GROUP BY customer_id
HAVING SUM(order_amount) > 1000
ORDER BY total_spent DESC;

In this query:

  • WHERE filters individual orders to only completed orders from 2024
  • GROUP BY aggregates the filtered orders by customer
  • HAVING filters the grouped results to customers who spent over $1000
  • ORDER BY sorts the final results

This is more efficient than using only HAVING because WHERE reduces the dataset before the expensive GROUP BY operation occurs.

Common HAVING Use Cases

Finding Duplicate Records

One of the most practical uses for HAVING is detecting duplicates:

SELECT 
    email,
    COUNT(*) as duplicate_count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

This immediately shows which email addresses appear multiple times in your users table.

Analyzing Sales Metrics

HAVING excels at business intelligence queries:

SELECT 
    DATE_FORMAT(order_date, '%Y-%m') as month,
    COUNT(*) as orders,
    SUM(order_amount) as revenue,
    AVG(order_amount) as avg_order_value
FROM orders
WHERE order_status = 'completed'
GROUP BY DATE_FORMAT(order_date, '%Y-%m')
HAVING SUM(order_amount) > 50000
ORDER BY month DESC;

This identifies high-revenue months (over $50,000) and provides key metrics for each.

Identifying Top Performers

Filter for entities that meet performance thresholds:

SELECT 
    store_id,
    COUNT(DISTINCT customer_id) as unique_customers,
    SUM(sale_amount) as total_revenue,
    AVG(sale_amount) as avg_sale
FROM sales
WHERE sale_date >= DATE_SUB(CURDATE(), INTERVAL 90 DAY)
GROUP BY store_id
HAVING COUNT(DISTINCT customer_id) >= 100
    AND AVG(sale_amount) > 75;

This finds stores with at least 100 unique customers and an average sale over $75 in the last 90 days.

Performance Considerations and Best Practices

Filter Early with WHERE

Always use WHERE to reduce rows before grouping when possible. This is more efficient:

-- BETTER: Filter before grouping
SELECT 
    customer_id,
    COUNT(*) as order_count
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY customer_id
HAVING COUNT(*) > 5;

-- WORSE: All rows grouped, then filtered
SELECT 
    customer_id,
    COUNT(*) as order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5 
    AND MIN(order_date) >= '2024-01-01';

The first query filters millions of old orders before grouping. The second groups everything, then filters—much slower.

Index Columns Used in WHERE and GROUP BY

While you can’t index aggregate functions, you can index the columns used in WHERE and GROUP BY:

-- Add indexes to improve this query
CREATE INDEX idx_order_date ON orders(order_date);
CREATE INDEX idx_customer_status ON orders(customer_id, order_status);

SELECT 
    customer_id,
    COUNT(*) as order_count,
    SUM(order_amount) as total
FROM orders
WHERE order_status = 'completed'
GROUP BY customer_id
HAVING SUM(order_amount) > 1000;

Avoid Unnecessary Columns in GROUP BY

Only group by columns you actually need. Each additional column in GROUP BY creates more groups and slows the query:

-- Only group by what you need
SELECT 
    customer_id,
    COUNT(*) as order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;

Use EXPLAIN to Analyze Query Performance

Always check your query execution plan:

EXPLAIN SELECT 
    customer_id,
    COUNT(*) as order_count
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY customer_id
HAVING COUNT(*) > 5;

Look for “Using temporary” and “Using filesort” in the Extra column—these indicate expensive operations that might benefit from indexing or query restructuring.

The HAVING clause is essential for any serious data analysis in MySQL. Master it alongside GROUP BY, and you’ll be able to answer complex business questions efficiently. Remember: WHERE filters rows, HAVING filters groups. Use WHERE to reduce data early, then HAVING to filter your aggregated results. This combination gives you both correctness and performance.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.