How to Use AVG in MySQL
The AVG function calculates the arithmetic mean of a set of values in MySQL. It sums all non-NULL values in a column and divides by the count of those values. This makes it indispensable for data...
Key Insights
- The AVG function automatically ignores NULL values, which can lead to unexpected results if you’re not aware—always check your data’s NULL distribution before relying on averages
- Combining AVG with GROUP BY and HAVING lets you calculate category-specific averages and filter groups based on those averages, enabling powerful analytical queries
- Integer columns can cause precision loss with AVG—cast to DECIMAL or use ROUND() to control decimal places and avoid truncation issues
Understanding the AVG Function
The AVG function calculates the arithmetic mean of a set of values in MySQL. It sums all non-NULL values in a column and divides by the count of those values. This makes it indispensable for data analysis, reporting dashboards, and business intelligence queries where you need to understand central tendencies in your data.
The basic syntax is straightforward:
SELECT AVG(column_name) FROM table_name;
Here’s a simple example calculating the average price from a products table:
SELECT AVG(price) AS average_price
FROM products;
This returns a single value representing the mean price across all products. The AS average_price alias makes the result set more readable—use aliases consistently in production queries.
Working with Basic AVG Queries
The AVG function becomes more useful when combined with WHERE clauses to calculate conditional averages. Here’s how to find the average price of products in a specific category:
SELECT AVG(price) AS avg_electronics_price
FROM products
WHERE category = 'Electronics';
Understanding how AVG handles NULL values is critical. Consider this employee salary table:
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
INSERT INTO employees VALUES
(1, 'Alice', 'Engineering', 95000.00),
(2, 'Bob', 'Engineering', 87000.00),
(3, 'Carol', 'Marketing', NULL),
(4, 'David', 'Marketing', 72000.00);
When you calculate the average salary:
SELECT AVG(salary) AS average_salary
FROM employees;
MySQL ignores Carol’s NULL salary entirely. The result is (95000 + 87000 + 72000) / 3 = 84666.67, not divided by 4. This behavior is usually what you want, but if you need to include NULLs as zeros, use COALESCE:
SELECT AVG(COALESCE(salary, 0)) AS average_including_nulls
FROM employees;
Grouping Data with AVG and GROUP BY
The real power of AVG emerges when calculating averages for different categories using GROUP BY. This lets you compare metrics across segments:
SELECT
department,
AVG(salary) AS avg_salary,
COUNT(*) AS employee_count
FROM employees
GROUP BY department;
This returns one row per department with its average salary. Always include non-aggregated columns in your GROUP BY clause—MySQL’s ONLY_FULL_GROUP_BY mode will reject queries that don’t follow this rule.
The HAVING clause filters groups based on aggregate values, unlike WHERE which filters individual rows before aggregation:
SELECT
department,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 80000;
This returns only departments where the average salary exceeds $80,000. You can’t use WHERE for this because the average doesn’t exist until after grouping.
Here’s a more complex example with sales data:
SELECT
region,
YEAR(sale_date) AS sale_year,
AVG(sale_amount) AS avg_sale,
COUNT(*) AS total_sales
FROM sales
WHERE sale_date >= '2023-01-01'
GROUP BY region, YEAR(sale_date)
HAVING AVG(sale_amount) > 1000
ORDER BY region, sale_year;
This calculates average sales by region and year, filtering for recent sales and excluding region-year combinations with low averages.
Combining AVG with JOINs
Real-world databases spread data across multiple tables. Use JOINs to calculate averages across related data:
SELECT
c.customer_name,
AVG(o.order_total) AS avg_order_value,
COUNT(o.order_id) AS order_count
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name
HAVING COUNT(o.order_id) >= 3;
This finds the average order value for customers with at least three orders. The INNER JOIN ensures you only include customers who have placed orders.
LEFT JOIN behaves differently—it includes all rows from the left table even without matches:
SELECT
c.customer_name,
AVG(o.order_total) AS avg_order_value,
COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name;
Customers without orders will have NULL for avg_order_value (since AVG of no rows returns NULL) and 0 for order_count.
Here’s a three-table join calculating average product ratings with category information:
SELECT
cat.category_name,
AVG(r.rating) AS avg_rating,
COUNT(r.rating_id) AS rating_count
FROM categories cat
INNER JOIN products p ON cat.category_id = p.category_id
INNER JOIN ratings r ON p.product_id = r.product_id
GROUP BY cat.category_id, cat.category_name
HAVING COUNT(r.rating_id) >= 10;
This shows average ratings by category, but only for categories with at least 10 ratings.
Advanced AVG Techniques
Use AVG with DISTINCT to calculate the average of unique values only:
SELECT
AVG(DISTINCT price) AS avg_unique_prices
FROM products;
If you have duplicate prices, this counts each unique price only once. This is useful when analyzing price points rather than all products.
Control decimal precision with ROUND:
SELECT
department,
ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department;
Combine multiple aggregate functions for comprehensive analysis:
SELECT
department,
COUNT(*) AS employee_count,
AVG(salary) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary,
MAX(salary) - MIN(salary) AS salary_range
FROM employees
GROUP BY department;
Use subqueries to compare individual values against the average:
SELECT
name,
salary,
(SELECT AVG(salary) FROM employees) AS company_avg,
salary - (SELECT AVG(salary) FROM employees) AS difference_from_avg
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees)
ORDER BY salary DESC;
This identifies employees earning above the company average and shows by how much. For better performance with large datasets, use a JOIN with a derived table instead:
SELECT
e.name,
e.salary,
avg_table.company_avg,
e.salary - avg_table.company_avg AS difference_from_avg
FROM employees e
CROSS JOIN (
SELECT AVG(salary) AS company_avg
FROM employees
) AS avg_table
WHERE e.salary > avg_table.company_avg
ORDER BY e.salary DESC;
Common Pitfalls and Best Practices
Integer division causes precision loss. If you’re averaging integer columns, MySQL returns an integer result by default in older versions:
-- Potential precision loss
SELECT AVG(quantity) FROM orders;
-- Better: ensure decimal result
SELECT AVG(quantity * 1.0) FROM orders;
-- Best: use proper column types
ALTER TABLE orders MODIFY quantity DECIMAL(10,2);
Always define numeric columns that might need decimal precision as DECIMAL or FLOAT, not INT.
Index columns used in WHERE clauses and GROUP BY for better performance:
CREATE INDEX idx_department ON employees(department);
CREATE INDEX idx_sale_date ON sales(sale_date);
This dramatically speeds up queries on large tables. AVG itself can’t be indexed, but the columns you filter and group by can be.
Forgetting GROUP BY columns causes errors in strict mode:
-- Wrong: name isn't in GROUP BY
SELECT name, department, AVG(salary)
FROM employees
GROUP BY department;
-- Correct: include all non-aggregated columns
SELECT department, AVG(salary)
FROM employees
GROUP BY department;
If you need the name, use a subquery or window function instead.
Watch out for empty result sets—AVG returns NULL when no rows match:
SELECT COALESCE(AVG(salary), 0) AS avg_salary
FROM employees
WHERE department = 'NonexistentDept';
Use COALESCE to provide a default value when AVG returns NULL.
Finally, be mindful of outliers skewing your averages. Consider using MEDIAN (available via window functions in MySQL 8.0+) or calculating both average and median for a complete picture:
SELECT
AVG(salary) AS mean_salary,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) OVER() AS median_salary
FROM employees;
The AVG function is fundamental to MySQL analytics, but using it effectively requires understanding its NULL handling, combining it with GROUP BY and JOINs, and avoiding common performance and precision pitfalls.