SQL - COUNT() Function with Examples
The `COUNT()` function is one of SQL's five core aggregate functions, and arguably the one you'll use most frequently. It returns the number of rows that match a specified condition, making it...
Key Insights
COUNT(*)counts all rows including those with NULL values, whileCOUNT(column_name)excludes NULLs—choosing the wrong one silently gives you incorrect results- Combining
COUNT()withGROUP BYandHAVINGtransforms it from a simple counter into a powerful analytical tool for finding patterns in your data - The
COUNT(*) vs COUNT(1)performance debate is a myth in modern databases—focus instead on proper indexing and avoiding unnecessary DISTINCT operations
Introduction to COUNT()
The COUNT() function is one of SQL’s five core aggregate functions, and arguably the one you’ll use most frequently. It returns the number of rows that match a specified condition, making it essential for data analysis, reporting, and application logic.
Whether you’re building a dashboard showing total orders, validating data quality by counting missing values, or implementing pagination, COUNT() is your go-to tool. Despite its apparent simplicity, developers frequently misuse it—particularly when NULL values enter the picture.
This article covers everything you need to know about COUNT(), from basic syntax to performance optimization. By the end, you’ll understand exactly which variation to use in any situation.
COUNT() Syntax Variations
SQL provides three distinct forms of COUNT(), each serving a specific purpose:
-- Count all rows in a table
SELECT COUNT(*) FROM employees;
-- Count non-NULL values in a specific column
SELECT COUNT(email) FROM employees;
-- Count unique non-NULL values in a column
SELECT COUNT(DISTINCT department) FROM employees;
Let’s establish a sample table we’ll use throughout this article:
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
department VARCHAR(50),
email VARCHAR(100),
salary DECIMAL(10, 2),
hire_date DATE
);
INSERT INTO employees VALUES
(1, 'Alice Chen', 'Engineering', 'alice@company.com', 95000, '2021-03-15'),
(2, 'Bob Martinez', 'Engineering', 'bob@company.com', 87000, '2022-01-10'),
(3, 'Carol White', 'Marketing', NULL, 72000, '2021-07-22'),
(4, 'David Kim', 'Engineering', 'david@company.com', 110000, '2019-11-01'),
(5, 'Eva Johnson', 'Sales', 'eva@company.com', 68000, '2023-02-14'),
(6, 'Frank Brown', NULL, 'frank@company.com', 55000, '2022-09-30'),
(7, 'Grace Lee', 'Marketing', NULL, 78000, '2020-05-18');
Notice that Carol and Grace have NULL emails, and Frank has a NULL department. These NULL values will demonstrate the critical differences between COUNT variations.
COUNT(*) vs COUNT(column_name)
This distinction trips up even experienced developers. The behavior difference is straightforward but has significant implications:
COUNT(*)counts every row, regardless of NULL valuesCOUNT(column_name)counts only rows where that column is NOT NULL
-- Count all employees
SELECT COUNT(*) AS total_employees FROM employees;
-- Result: 7
-- Count employees with email addresses
SELECT COUNT(email) AS employees_with_email FROM employees;
-- Result: 5
-- Count employees with assigned departments
SELECT COUNT(department) AS employees_with_dept FROM employees;
-- Result: 6
Here’s a practical example showing both in a single query:
SELECT
COUNT(*) AS total_rows,
COUNT(email) AS has_email,
COUNT(department) AS has_department,
COUNT(*) - COUNT(email) AS missing_email,
COUNT(*) - COUNT(department) AS missing_department
FROM employees;
Result:
| total_rows | has_email | has_department | missing_email | missing_department |
|---|---|---|---|---|
| 7 | 5 | 6 | 2 | 1 |
This pattern is invaluable for data quality checks. You can quickly identify which columns have missing data and how much.
When to use which:
- Use
COUNT(*)when you need the total number of rows - Use
COUNT(column)when you specifically want to count non-NULL values - Use
COUNT(column)for data quality audits to find missing values
Using COUNT() with DISTINCT
When you need to count unique values rather than total occurrences, COUNT(DISTINCT column) eliminates duplicates before counting:
-- How many unique departments exist?
SELECT COUNT(DISTINCT department) AS unique_departments
FROM employees;
-- Result: 3 (Engineering, Marketing, Sales - NULL is excluded)
-- Compare with total department assignments
SELECT
COUNT(department) AS total_assignments,
COUNT(DISTINCT department) AS unique_departments
FROM employees;
-- Result: 6 assignments, 3 unique departments
A common real-world use case is counting unique customers who placed orders:
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10, 2)
);
INSERT INTO orders VALUES
(1, 101, '2024-01-15', 250.00),
(2, 102, '2024-01-16', 175.50),
(3, 101, '2024-01-17', 89.99),
(4, 103, '2024-01-18', 432.00),
(5, 101, '2024-01-19', 156.75),
(6, 102, '2024-01-20', 299.99);
-- Total orders vs unique customers
SELECT
COUNT(*) AS total_orders,
COUNT(DISTINCT customer_id) AS unique_customers
FROM orders;
-- Result: 6 orders, 3 unique customers
You can also count distinct combinations using concatenation or multiple columns (syntax varies by database):
-- PostgreSQL/MySQL: Count unique year-month combinations
SELECT COUNT(DISTINCT DATE_TRUNC('month', order_date)) AS unique_months
FROM orders;
-- SQL Server alternative
SELECT COUNT(DISTINCT CONCAT(YEAR(order_date), '-', MONTH(order_date)))
FROM orders;
COUNT() with WHERE and GROUP BY
The real power of COUNT() emerges when combined with filtering and grouping. The WHERE clause filters rows before counting, while GROUP BY creates separate counts for each category.
-- Count employees hired in 2022 or later
SELECT COUNT(*) AS recent_hires
FROM employees
WHERE hire_date >= '2022-01-01';
-- Result: 3
-- Count employees per department
SELECT
department,
COUNT(*) AS employee_count
FROM employees
WHERE department IS NOT NULL
GROUP BY department
ORDER BY employee_count DESC;
Result:
| department | employee_count |
|---|---|
| Engineering | 3 |
| Marketing | 2 |
| Sales | 1 |
Here’s a more complex example combining multiple conditions:
-- Count orders by month with total revenue
SELECT
DATE_TRUNC('month', order_date) AS month,
COUNT(*) AS order_count,
COUNT(DISTINCT customer_id) AS unique_customers,
SUM(total_amount) AS total_revenue
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;
For the employees table, let’s analyze hiring patterns:
-- Employees hired per year with salary insights
SELECT
EXTRACT(YEAR FROM hire_date) AS hire_year,
COUNT(*) AS employees_hired,
ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY EXTRACT(YEAR FROM hire_date)
ORDER BY hire_year;
COUNT() with HAVING Clause
While WHERE filters rows before aggregation, HAVING filters groups after aggregation. This lets you filter based on count values:
-- Find departments with more than 1 employee
SELECT
department,
COUNT(*) AS employee_count
FROM employees
WHERE department IS NOT NULL
GROUP BY department
HAVING COUNT(*) > 1;
Result:
| department | employee_count |
|---|---|
| Engineering | 3 |
| Marketing | 2 |
A practical e-commerce example:
-- Find customers who placed more than 2 orders (loyal customers)
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 2;
-- Result: Customer 101 with 3 orders
You can combine WHERE and HAVING for precise filtering:
-- Departments with more than 1 employee earning above 70k
SELECT
department,
COUNT(*) AS high_earners
FROM employees
WHERE salary > 70000 AND department IS NOT NULL
GROUP BY department
HAVING COUNT(*) > 1;
Performance Considerations
Understanding COUNT() performance helps you write efficient queries at scale.
The COUNT(*) vs COUNT(1) Myth
You’ll hear developers claim COUNT(1) is faster than COUNT(*). This was potentially true decades ago, but modern query optimizers treat them identically. Use COUNT(*) for clarity—it explicitly communicates “count all rows.”
-- These are equivalent in modern databases
SELECT COUNT(*) FROM employees;
SELECT COUNT(1) FROM employees;
Indexing Impact
COUNT(*) on an entire table can be expensive for large tables. Some strategies:
-- If you have an indexed column, the optimizer may use an index-only scan
SELECT COUNT(*) FROM employees;
-- Counting with a WHERE clause benefits from indexes
CREATE INDEX idx_department ON employees(department);
SELECT COUNT(*) FROM employees WHERE department = 'Engineering';
COUNT(DISTINCT) Performance
DISTINCT operations require sorting or hashing to eliminate duplicates, making them more expensive than simple counts. On large datasets, consider whether you truly need exact counts:
-- Expensive on large tables
SELECT COUNT(DISTINCT customer_id) FROM orders;
-- PostgreSQL approximate alternative for huge tables
SELECT COUNT(DISTINCT customer_id) FROM orders; -- exact
-- vs using HyperLogLog extensions for approximations
Caching Counts
For frequently accessed counts on large tables, consider materialized views or counter caches:
-- Materialized view for department statistics
CREATE MATERIALIZED VIEW department_stats AS
SELECT
department,
COUNT(*) AS employee_count,
AVG(salary) AS avg_salary
FROM employees
WHERE department IS NOT NULL
GROUP BY department;
-- Refresh periodically
REFRESH MATERIALIZED VIEW department_stats;
The COUNT() function appears simple but understanding its nuances—particularly around NULL handling and performance—separates competent SQL developers from those who produce subtly incorrect or inefficient queries. Master these patterns and you’ll handle any counting requirement with confidence.