SQL - SUM() Function with Examples | Application Architect

Key Insights

SUM() is deceptively simple but becomes powerful when combined with GROUP BY, CASE expressions, and window functions to solve complex aggregation problems
NULL values are silently ignored by SUM(), which can lead to unexpected results—always use COALESCE() when NULL handling matters
Using SUM() with DISTINCT is rarely necessary and often indicates a data modeling problem; understand your data before reaching for this combination

Introduction to SUM()

The SUM() function is one of SQL’s five core aggregate functions, alongside COUNT(), AVG(), MIN(), and MAX(). It does exactly what you’d expect: adds up numeric values and returns the total. Simple in concept, but essential for virtually every reporting query you’ll write.

You’ll reach for SUM() whenever you need to answer questions like “What’s our total revenue?” or “How many hours did the team log this sprint?” It transforms rows of individual values into meaningful totals.

Here’s the basic syntax:

SELECT SUM(column_name)
FROM table_name
WHERE condition;

The function accepts a single numeric expression and returns the sum of all non-NULL values. That last part—non-NULL values—is important and trips up many developers. We’ll cover that in detail later.

Basic SUM() Usage

Let’s work with a practical example. Imagine an orders table that tracks e-commerce transactions:

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    product_category VARCHAR(50),
    region VARCHAR(50),
    order_amount DECIMAL(10, 2),
    order_date DATE
);

The simplest SUM() query calculates the total of all orders:

SELECT SUM(order_amount) AS total_revenue
FROM orders;

This returns a single row with one column containing the sum of every order_amount value in the table.

SUM() works with any numeric data type: INTEGER, DECIMAL, FLOAT, BIGINT, and their variants. Be mindful of the return type—summing integers returns an integer (which can overflow), while summing decimals preserves precision. For financial calculations, always use DECIMAL to avoid floating-point rounding errors:

-- Good: Precise decimal arithmetic
SELECT SUM(order_amount) FROM orders;  -- order_amount is DECIMAL(10,2)

-- Risky: Floating-point can introduce rounding errors
SELECT SUM(CAST(order_amount AS FLOAT)) FROM orders;

SUM() with WHERE Clause

Real-world queries almost always filter data before aggregating. The WHERE clause executes before SUM() processes any rows:

SELECT SUM(order_amount) AS q1_revenue
FROM orders
WHERE order_date >= '2024-01-01'
  AND order_date < '2024-04-01';

You can combine multiple conditions to narrow your aggregation:

SELECT SUM(order_amount) AS electronics_revenue_west
FROM orders
WHERE product_category = 'Electronics'
  AND region = 'West'
  AND order_date >= '2024-01-01';

A common mistake is trying to filter on the aggregated result in the WHERE clause. This won’t work:

-- WRONG: Can't use aggregate in WHERE
SELECT customer_id, SUM(order_amount)
FROM orders
WHERE SUM(order_amount) > 1000
GROUP BY customer_id;

Use HAVING instead to filter after aggregation:

-- CORRECT: HAVING filters aggregated results
SELECT customer_id, SUM(order_amount) AS total_spent
FROM orders
GROUP BY customer_id
HAVING SUM(order_amount) > 1000;

SUM() with GROUP BY

GROUP BY transforms SUM() from a single-value function into a powerful categorization tool. Instead of one total, you get totals for each distinct value in your grouping columns:

SELECT region, SUM(order_amount) AS regional_revenue
FROM orders
GROUP BY region
ORDER BY regional_revenue DESC;

Result:

region    | regional_revenue
----------|------------------
West      | 245000.00
East      | 198000.00
Central   | 156000.00
South     | 142000.00

You can group by multiple columns for more granular breakdowns:

SELECT 
    region,
    product_category,
    SUM(order_amount) AS category_revenue
FROM orders
GROUP BY region, product_category
ORDER BY region, category_revenue DESC;

This produces a row for each unique combination of region and category. It’s the SQL equivalent of a pivot table in spreadsheet software.

Add date-based grouping for time series analysis:

SELECT 
    DATE_TRUNC('month', order_date) AS month,
    region,
    SUM(order_amount) AS monthly_revenue
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY DATE_TRUNC('month', order_date), region
ORDER BY month, region;

SUM() with DISTINCT

SUM(DISTINCT column) adds only unique values, ignoring duplicates:

SELECT SUM(DISTINCT order_amount) AS sum_unique_amounts
FROM orders;

Here’s the thing: you rarely need this. If you’re summing order amounts and getting duplicates, you probably have a join problem or a data modeling issue. Consider this scenario:

-- This query has a bug
SELECT SUM(o.order_amount) AS total
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id;

If an order has three items, the order_amount gets counted three times. The fix isn’t SUM(DISTINCT)—it’s fixing the query:

-- Better: Sum at the correct level
SELECT SUM(order_amount) AS total
FROM orders;

-- Or if you need item-level data too
SELECT SUM(DISTINCT o.order_amount) AS order_total
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id;

Use SUM(DISTINCT) deliberately when you genuinely need unique values summed, not as a band-aid for query problems.

Advanced SUM() Techniques

Conditional Sums with CASE

CASE expressions inside SUM() let you calculate multiple conditional totals in a single query:

SELECT 
    region,
    SUM(order_amount) AS total_revenue,
    SUM(CASE WHEN product_category = 'Electronics' THEN order_amount ELSE 0 END) AS electronics_revenue,
    SUM(CASE WHEN product_category = 'Clothing' THEN order_amount ELSE 0 END) AS clothing_revenue,
    SUM(CASE WHEN order_date >= '2024-01-01' THEN order_amount ELSE 0 END) AS ytd_revenue
FROM orders
GROUP BY region;

This pattern is incredibly useful for building summary reports without multiple subqueries.

You can also use CASE to implement business logic:

SELECT 
    customer_id,
    SUM(CASE 
        WHEN order_amount >= 500 THEN order_amount * 0.10
        WHEN order_amount >= 100 THEN order_amount * 0.05
        ELSE 0 
    END) AS loyalty_points_earned
FROM orders
GROUP BY customer_id;

SUM() as a Window Function

Adding an OVER clause transforms SUM() into a window function, calculating running totals and cumulative sums:

SELECT 
    order_date,
    order_amount,
    SUM(order_amount) OVER (ORDER BY order_date) AS running_total
FROM orders
WHERE customer_id = 1001
ORDER BY order_date;

Result:

order_date  | order_amount | running_total
------------|--------------|---------------
2024-01-05  | 150.00       | 150.00
2024-01-18  | 275.00       | 425.00
2024-02-03  | 89.00        | 514.00
2024-02-14  | 320.00       | 834.00

Partition the window for running totals within categories:

SELECT 
    order_date,
    region,
    order_amount,
    SUM(order_amount) OVER (
        PARTITION BY region 
        ORDER BY order_date
    ) AS regional_running_total
FROM orders
ORDER BY region, order_date;

Handling NULL Values

SUM() ignores NULL values entirely. This query returns 300, not NULL:

SELECT SUM(value) FROM (VALUES (100), (NULL), (200)) AS t(value);
-- Returns: 300

When NULLs matter, use COALESCE to substitute a default:

SELECT SUM(COALESCE(discount_amount, 0)) AS total_discounts
FROM orders;

If all values are NULL, SUM() returns NULL, not zero. Handle this at the outer level:

SELECT COALESCE(SUM(order_amount), 0) AS total_revenue
FROM orders
WHERE order_date = '2099-01-01';  -- No orders exist
-- Returns: 0 (not NULL)

Common Pitfalls and Best Practices

NULL behavior catches everyone eventually. Remember: SUM() of an empty set returns NULL, and SUM() silently skips NULL values. Always wrap in COALESCE when zero is the expected default for “no data.”

Watch for integer overflow. Summing a million rows of integers can exceed INT limits. Use BIGINT for large aggregations or cast explicitly:

SELECT SUM(CAST(quantity AS BIGINT)) AS total_units
FROM order_items;

Index your filter columns, not the summed column. An index on order_amount won’t help SUM() performance. Index the columns in your WHERE and GROUP BY clauses instead:

-- This index helps the query
CREATE INDEX idx_orders_date_region ON orders(order_date, region);

SELECT region, SUM(order_amount)
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY region;

Avoid SELECT * with aggregates. Every non-aggregated column must appear in GROUP BY. Be explicit about what you’re selecting.

Test with edge cases. What happens when there’s no data? When all values are NULL? When amounts are negative? Build these scenarios into your test suite.

SUM() is foundational SQL. Master its nuances—especially NULL handling and window function syntax—and you’ll write cleaner, more efficient queries for every reporting task you encounter.