How to Use Aggregate Functions in SQLite
Aggregate functions are SQLite's workhorses for data analysis. They take a set of rows as input and return a single computed value. Instead of processing data row-by-row in your application code, you...
Key Insights
- SQLite’s aggregate functions (COUNT, SUM, AVG, MIN, MAX) transform multiple rows into single calculated values, making them essential for reporting and data analysis tasks
- The GROUP BY clause partitions data into subsets before aggregation, while HAVING filters those aggregated results—understanding this distinction prevents common SQL errors
- Advanced functions like GROUP_CONCAT and TOTAL offer SQLite-specific capabilities that handle edge cases better than standard SQL aggregates, particularly with NULL values and string concatenation
Introduction to Aggregate Functions
Aggregate functions are SQLite’s workhorses for data analysis. They take a set of rows as input and return a single computed value. Instead of processing data row-by-row in your application code, you push the computation to the database where it belongs—closer to the data and significantly faster.
Every time you need to answer questions like “How many?”, “What’s the total?”, or “What’s the average?”, you’re dealing with aggregate functions. They’re fundamental to reporting, analytics, and understanding your data at scale. SQLite implements all standard SQL aggregate functions plus a few useful extensions.
Basic Aggregate Functions
Let’s start with the five core aggregate functions you’ll use constantly.
COUNT() tells you how many rows match your criteria:
-- Count all orders
SELECT COUNT(*) FROM orders;
-- Count orders with tracking numbers (excludes NULLs)
SELECT COUNT(tracking_number) FROM orders;
-- Count distinct customers who placed orders
SELECT COUNT(DISTINCT customer_id) FROM orders;
The distinction matters: COUNT(*) counts all rows, while COUNT(column_name) ignores NULL values. This becomes critical when analyzing incomplete data.
SUM() adds up numeric values:
-- Calculate total revenue
SELECT SUM(amount) FROM orders;
-- Total quantity of products sold
SELECT SUM(quantity) FROM order_items;
SUM returns NULL if all values are NULL, not zero. Keep this in mind when displaying results.
AVG() computes the arithmetic mean:
-- Average order value
SELECT AVG(amount) FROM orders;
-- Average product price
SELECT AVG(price) FROM products;
AVG automatically excludes NULL values from the calculation, which usually makes sense but can surprise you if you’re not expecting it.
MIN() and MAX() find extreme values:
-- Price range
SELECT MIN(price) as lowest_price,
MAX(price) as highest_price
FROM products;
-- Date range of orders
SELECT MIN(created_at) as first_order,
MAX(created_at) as latest_order
FROM orders;
These work on any comparable data type: numbers, dates, even strings (alphabetically).
GROUP BY Clause with Aggregates
Aggregate functions become powerful when combined with GROUP BY. This clause partitions your data into groups before applying the aggregate function to each group separately.
-- Total sales per customer
SELECT customer_id,
COUNT(*) as order_count,
SUM(amount) as total_spent
FROM orders
GROUP BY customer_id;
This query groups all orders by customer, then counts and sums within each group. You get one result row per customer.
Here’s a more complex example analyzing product categories:
-- Category analysis
SELECT category,
COUNT(*) as product_count,
AVG(price) as avg_price,
MIN(price) as min_price,
MAX(price) as max_price,
SUM(stock_quantity) as total_inventory
FROM products
GROUP BY category
ORDER BY total_inventory DESC;
This single query gives you a complete inventory overview by category. Notice how multiple aggregate functions work together.
For hierarchical grouping, list multiple columns:
-- Sales by year and month
SELECT strftime('%Y', order_date) as year,
strftime('%m', order_date) as month,
COUNT(*) as orders,
SUM(amount) as revenue
FROM orders
GROUP BY year, month
ORDER BY year, month;
Critical rule: Every column in your SELECT that isn’t inside an aggregate function must appear in the GROUP BY clause. SQLite is lenient about this, but other databases will reject the query. Write portable SQL by following this rule strictly.
HAVING Clause for Filtering Groups
WHERE filters rows before grouping. HAVING filters groups after aggregation. This distinction is fundamental.
-- Customers who placed more than 5 orders
SELECT customer_id,
COUNT(*) as order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;
You can’t use WHERE here because the count doesn’t exist until after grouping. HAVING operates on the aggregated result.
Find product categories with high average prices:
-- Premium categories (average price over $100)
SELECT category,
COUNT(*) as product_count,
AVG(price) as avg_price
FROM products
GROUP BY category
HAVING AVG(price) > 100
ORDER BY avg_price DESC;
Combine WHERE and HAVING to filter at both stages:
-- Active customers with high order frequency
SELECT customer_id,
COUNT(*) as order_count,
SUM(amount) as total_spent
FROM orders
WHERE status = 'completed' -- Filter before grouping
AND order_date >= date('now', '-1 year')
GROUP BY customer_id
HAVING COUNT(*) >= 10 -- Filter after grouping
ORDER BY total_spent DESC;
This query first filters to completed orders from the last year, groups by customer, then keeps only customers with 10+ orders.
Advanced Aggregate Functions
SQLite provides several advanced aggregate functions that handle specific use cases elegantly.
GROUP_CONCAT() concatenates values into a single string:
-- List all products in each category
SELECT category,
GROUP_CONCAT(product_name, ', ') as products
FROM products
GROUP BY category;
-- Tags for each article (with custom separator)
SELECT article_id,
GROUP_CONCAT(tag_name, ' | ') as tags
FROM article_tags
JOIN tags USING(tag_id)
GROUP BY article_id;
This is invaluable for creating comma-separated lists or generating summary strings. The second parameter specifies the separator (default is comma).
TOTAL() is SQLite’s NULL-safe alternative to SUM():
-- SUM returns NULL if all values are NULL
SELECT SUM(discount_amount) FROM orders; -- NULL if no discounts
-- TOTAL returns 0.0 instead
SELECT TOTAL(discount_amount) FROM orders; -- 0.0 if no discounts
TOTAL always returns a floating-point number, even for integer columns. Use it when you need guaranteed numeric output for calculations.
DISTINCT within aggregates counts or sums unique values only:
-- Unique customers who ordered each product
SELECT product_id,
COUNT(DISTINCT customer_id) as unique_customers,
COUNT(*) as total_orders
FROM order_items
GROUP BY product_id;
-- Total unique revenue (if same amount appears multiple times)
SELECT SUM(DISTINCT amount) FROM orders;
This pattern helps analyze customer reach and remove duplicate contributions.
Practical Use Cases and Best Practices
Let’s combine these techniques into real-world queries.
Sales dashboard query:
-- Comprehensive sales metrics
SELECT
strftime('%Y-%m', order_date) as month,
COUNT(*) as total_orders,
COUNT(DISTINCT customer_id) as unique_customers,
SUM(amount) as revenue,
AVG(amount) as avg_order_value,
MIN(amount) as smallest_order,
MAX(amount) as largest_order,
GROUP_CONCAT(DISTINCT payment_method) as payment_methods_used
FROM orders
WHERE status = 'completed'
AND order_date >= date('now', '-6 months')
GROUP BY month
HAVING revenue > 1000 -- Only months with meaningful revenue
ORDER BY month DESC;
This single query provides a complete monthly sales overview with multiple dimensions.
Subqueries with aggregates solve complex problems:
-- Customers who spent above average
SELECT customer_id,
total_spent
FROM (
SELECT customer_id,
SUM(amount) as total_spent
FROM orders
GROUP BY customer_id
)
WHERE total_spent > (
SELECT AVG(total_spent)
FROM (
SELECT SUM(amount) as total_spent
FROM orders
GROUP BY customer_id
)
);
Breaking complex queries into subqueries improves readability and maintainability.
Common pitfalls to avoid:
Don’t mix aggregated and non-aggregated columns without GROUP BY:
-- WRONG: product_name isn't aggregated or grouped
SELECT category, product_name, AVG(price)
FROM products
GROUP BY category;
-- CORRECT: Either aggregate it
SELECT category,
GROUP_CONCAT(product_name) as products,
AVG(price)
FROM products
GROUP BY category;
-- Or include it in GROUP BY
SELECT category, product_name, AVG(price)
FROM products
GROUP BY category, product_name;
Remember that aggregates ignore NULL values (except COUNT(*)):
-- If half your products have NULL prices, AVG only considers non-NULL
SELECT AVG(price) FROM products; -- Excludes NULLs
-- To include NULLs as zero
SELECT AVG(COALESCE(price, 0)) FROM products;
Use appropriate data types. Aggregating text columns with SUM or AVG produces nonsense results without errors.
Performance considerations: Aggregates on indexed columns perform better. If you frequently group by customer_id, index it. For large datasets, aggregation can be expensive—consider materialized views or summary tables for frequently-accessed reports.
Aggregate functions are essential tools in your SQLite toolkit. Master these patterns, understand the GROUP BY and HAVING distinction, and you’ll write efficient analytical queries that answer complex business questions with simple SQL.