SQL - COUNT() Function with Examples

The `COUNT()` function is one of SQL's five core aggregate functions, and arguably the one you'll use most frequently. It returns the number of rows that match a specified condition, making it...

Key Insights

  • COUNT(*) counts all rows including those with NULL values, while COUNT(column_name) excludes NULLs—choosing the wrong one silently gives you incorrect results
  • Combining COUNT() with GROUP BY and HAVING transforms it from a simple counter into a powerful analytical tool for finding patterns in your data
  • The COUNT(*) vs COUNT(1) performance debate is a myth in modern databases—focus instead on proper indexing and avoiding unnecessary DISTINCT operations

Introduction to COUNT()

The COUNT() function is one of SQL’s five core aggregate functions, and arguably the one you’ll use most frequently. It returns the number of rows that match a specified condition, making it essential for data analysis, reporting, and application logic.

Whether you’re building a dashboard showing total orders, validating data quality by counting missing values, or implementing pagination, COUNT() is your go-to tool. Despite its apparent simplicity, developers frequently misuse it—particularly when NULL values enter the picture.

This article covers everything you need to know about COUNT(), from basic syntax to performance optimization. By the end, you’ll understand exactly which variation to use in any situation.

COUNT() Syntax Variations

SQL provides three distinct forms of COUNT(), each serving a specific purpose:

-- Count all rows in a table
SELECT COUNT(*) FROM employees;

-- Count non-NULL values in a specific column
SELECT COUNT(email) FROM employees;

-- Count unique non-NULL values in a column
SELECT COUNT(DISTINCT department) FROM employees;

Let’s establish a sample table we’ll use throughout this article:

CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    department VARCHAR(50),
    email VARCHAR(100),
    salary DECIMAL(10, 2),
    hire_date DATE
);

INSERT INTO employees VALUES
(1, 'Alice Chen', 'Engineering', 'alice@company.com', 95000, '2021-03-15'),
(2, 'Bob Martinez', 'Engineering', 'bob@company.com', 87000, '2022-01-10'),
(3, 'Carol White', 'Marketing', NULL, 72000, '2021-07-22'),
(4, 'David Kim', 'Engineering', 'david@company.com', 110000, '2019-11-01'),
(5, 'Eva Johnson', 'Sales', 'eva@company.com', 68000, '2023-02-14'),
(6, 'Frank Brown', NULL, 'frank@company.com', 55000, '2022-09-30'),
(7, 'Grace Lee', 'Marketing', NULL, 78000, '2020-05-18');

Notice that Carol and Grace have NULL emails, and Frank has a NULL department. These NULL values will demonstrate the critical differences between COUNT variations.

COUNT(*) vs COUNT(column_name)

This distinction trips up even experienced developers. The behavior difference is straightforward but has significant implications:

  • COUNT(*) counts every row, regardless of NULL values
  • COUNT(column_name) counts only rows where that column is NOT NULL
-- Count all employees
SELECT COUNT(*) AS total_employees FROM employees;
-- Result: 7

-- Count employees with email addresses
SELECT COUNT(email) AS employees_with_email FROM employees;
-- Result: 5

-- Count employees with assigned departments
SELECT COUNT(department) AS employees_with_dept FROM employees;
-- Result: 6

Here’s a practical example showing both in a single query:

SELECT 
    COUNT(*) AS total_rows,
    COUNT(email) AS has_email,
    COUNT(department) AS has_department,
    COUNT(*) - COUNT(email) AS missing_email,
    COUNT(*) - COUNT(department) AS missing_department
FROM employees;

Result:

total_rows has_email has_department missing_email missing_department
7 5 6 2 1

This pattern is invaluable for data quality checks. You can quickly identify which columns have missing data and how much.

When to use which:

  • Use COUNT(*) when you need the total number of rows
  • Use COUNT(column) when you specifically want to count non-NULL values
  • Use COUNT(column) for data quality audits to find missing values

Using COUNT() with DISTINCT

When you need to count unique values rather than total occurrences, COUNT(DISTINCT column) eliminates duplicates before counting:

-- How many unique departments exist?
SELECT COUNT(DISTINCT department) AS unique_departments 
FROM employees;
-- Result: 3 (Engineering, Marketing, Sales - NULL is excluded)

-- Compare with total department assignments
SELECT 
    COUNT(department) AS total_assignments,
    COUNT(DISTINCT department) AS unique_departments
FROM employees;
-- Result: 6 assignments, 3 unique departments

A common real-world use case is counting unique customers who placed orders:

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    total_amount DECIMAL(10, 2)
);

INSERT INTO orders VALUES
(1, 101, '2024-01-15', 250.00),
(2, 102, '2024-01-16', 175.50),
(3, 101, '2024-01-17', 89.99),
(4, 103, '2024-01-18', 432.00),
(5, 101, '2024-01-19', 156.75),
(6, 102, '2024-01-20', 299.99);

-- Total orders vs unique customers
SELECT 
    COUNT(*) AS total_orders,
    COUNT(DISTINCT customer_id) AS unique_customers
FROM orders;
-- Result: 6 orders, 3 unique customers

You can also count distinct combinations using concatenation or multiple columns (syntax varies by database):

-- PostgreSQL/MySQL: Count unique year-month combinations
SELECT COUNT(DISTINCT DATE_TRUNC('month', order_date)) AS unique_months
FROM orders;

-- SQL Server alternative
SELECT COUNT(DISTINCT CONCAT(YEAR(order_date), '-', MONTH(order_date))) 
FROM orders;

COUNT() with WHERE and GROUP BY

The real power of COUNT() emerges when combined with filtering and grouping. The WHERE clause filters rows before counting, while GROUP BY creates separate counts for each category.

-- Count employees hired in 2022 or later
SELECT COUNT(*) AS recent_hires
FROM employees
WHERE hire_date >= '2022-01-01';
-- Result: 3

-- Count employees per department
SELECT 
    department,
    COUNT(*) AS employee_count
FROM employees
WHERE department IS NOT NULL
GROUP BY department
ORDER BY employee_count DESC;

Result:

department employee_count
Engineering 3
Marketing 2
Sales 1

Here’s a more complex example combining multiple conditions:

-- Count orders by month with total revenue
SELECT 
    DATE_TRUNC('month', order_date) AS month,
    COUNT(*) AS order_count,
    COUNT(DISTINCT customer_id) AS unique_customers,
    SUM(total_amount) AS total_revenue
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

For the employees table, let’s analyze hiring patterns:

-- Employees hired per year with salary insights
SELECT 
    EXTRACT(YEAR FROM hire_date) AS hire_year,
    COUNT(*) AS employees_hired,
    ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY EXTRACT(YEAR FROM hire_date)
ORDER BY hire_year;

COUNT() with HAVING Clause

While WHERE filters rows before aggregation, HAVING filters groups after aggregation. This lets you filter based on count values:

-- Find departments with more than 1 employee
SELECT 
    department,
    COUNT(*) AS employee_count
FROM employees
WHERE department IS NOT NULL
GROUP BY department
HAVING COUNT(*) > 1;

Result:

department employee_count
Engineering 3
Marketing 2

A practical e-commerce example:

-- Find customers who placed more than 2 orders (loyal customers)
SELECT 
    customer_id,
    COUNT(*) AS order_count,
    SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 2;
-- Result: Customer 101 with 3 orders

You can combine WHERE and HAVING for precise filtering:

-- Departments with more than 1 employee earning above 70k
SELECT 
    department,
    COUNT(*) AS high_earners
FROM employees
WHERE salary > 70000 AND department IS NOT NULL
GROUP BY department
HAVING COUNT(*) > 1;

Performance Considerations

Understanding COUNT() performance helps you write efficient queries at scale.

The COUNT(*) vs COUNT(1) Myth

You’ll hear developers claim COUNT(1) is faster than COUNT(*). This was potentially true decades ago, but modern query optimizers treat them identically. Use COUNT(*) for clarity—it explicitly communicates “count all rows.”

-- These are equivalent in modern databases
SELECT COUNT(*) FROM employees;
SELECT COUNT(1) FROM employees;

Indexing Impact

COUNT(*) on an entire table can be expensive for large tables. Some strategies:

-- If you have an indexed column, the optimizer may use an index-only scan
SELECT COUNT(*) FROM employees;

-- Counting with a WHERE clause benefits from indexes
CREATE INDEX idx_department ON employees(department);
SELECT COUNT(*) FROM employees WHERE department = 'Engineering';

COUNT(DISTINCT) Performance

DISTINCT operations require sorting or hashing to eliminate duplicates, making them more expensive than simple counts. On large datasets, consider whether you truly need exact counts:

-- Expensive on large tables
SELECT COUNT(DISTINCT customer_id) FROM orders;

-- PostgreSQL approximate alternative for huge tables
SELECT COUNT(DISTINCT customer_id) FROM orders; -- exact
-- vs using HyperLogLog extensions for approximations

Caching Counts

For frequently accessed counts on large tables, consider materialized views or counter caches:

-- Materialized view for department statistics
CREATE MATERIALIZED VIEW department_stats AS
SELECT 
    department,
    COUNT(*) AS employee_count,
    AVG(salary) AS avg_salary
FROM employees
WHERE department IS NOT NULL
GROUP BY department;

-- Refresh periodically
REFRESH MATERIALIZED VIEW department_stats;

The COUNT() function appears simple but understanding its nuances—particularly around NULL handling and performance—separates competent SQL developers from those who produce subtly incorrect or inefficient queries. Master these patterns and you’ll handle any counting requirement with confidence.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.