SQL - IS NULL / IS NOT NULL | Application Architect

Key Insights

NULL represents the absence of a value in SQL, not an empty string or zero, and requires special operators IS NULL and IS NOT NULL for comparison since standard equality operators don’t work with NULL values
Understanding NULL behavior is critical for data integrity, as NULL propagates through most operations and can produce unexpected results in aggregate functions, joins, and WHERE clauses
Proper NULL handling strategies—including COALESCE, NULLIF, and appropriate indexing—can prevent bugs and improve query performance in production systems

Understanding NULL in SQL

NULL is a special marker in SQL that indicates missing, unknown, or inapplicable data. Unlike empty strings (’’) or zeros (0), NULL represents the absence of any value. This distinction matters because NULL behaves differently from all other values in SQL operations.

Standard comparison operators (=, !=, <, >) don’t work with NULL. The expression column = NULL always evaluates to UNKNOWN, not TRUE or FALSE. This is why SQL provides dedicated operators: IS NULL and IS NOT NULL.

-- This will NOT return rows where status is NULL
SELECT * FROM orders WHERE status = NULL;

-- Correct way to check for NULL
SELECT * FROM orders WHERE status IS NULL;

-- Check for non-NULL values
SELECT * FROM orders WHERE status IS NOT NULL;

NULL Behavior in WHERE Clauses

WHERE clauses filter rows based on conditions that evaluate to TRUE. Rows where conditions evaluate to FALSE or UNKNOWN are excluded. This means NULL comparisons can silently exclude data if you’re not careful.

CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    manager_id INT,
    salary DECIMAL(10,2)
);

INSERT INTO employees VALUES
(1, 'Alice', NULL, 75000),
(2, 'Bob', 1, 65000),
(3, 'Carol', 1, 70000),
(4, 'Dave', NULL, 80000);

-- Find employees without managers
SELECT name FROM employees WHERE manager_id IS NULL;
-- Returns: Alice, Dave

-- This returns employees with manager_id = 1
SELECT name FROM employees WHERE manager_id = 1;
-- Returns: Bob, Carol

-- Combining conditions with NULL
SELECT name FROM employees 
WHERE manager_id IS NULL OR manager_id = 1;
-- Returns: Alice, Bob, Carol, Dave

NULL in Aggregate Functions

Most aggregate functions ignore NULL values, which can lead to unexpected results if you’re not aware of this behavior. COUNT(*) counts all rows, but COUNT(column) only counts non-NULL values.

CREATE TABLE sales (
    id INT PRIMARY KEY,
    product VARCHAR(50),
    amount DECIMAL(10,2),
    commission DECIMAL(10,2)
);

INSERT INTO sales VALUES
(1, 'Widget', 100.00, 10.00),
(2, 'Gadget', 150.00, NULL),
(3, 'Doohickey', 200.00, 20.00),
(4, 'Thingamajig', 175.00, NULL);

-- Different COUNT behaviors
SELECT 
    COUNT(*) as total_rows,
    COUNT(commission) as rows_with_commission,
    COUNT(amount) as rows_with_amount
FROM sales;
-- Returns: total_rows=4, rows_with_commission=2, rows_with_amount=4

-- AVG ignores NULL values
SELECT 
    AVG(commission) as avg_commission,
    SUM(commission) / COUNT(*) as manual_avg_with_nulls
FROM sales;
-- avg_commission = 15.00 (only considers 10 and 20)
-- manual_avg_with_nulls = 7.50 (includes NULLs as zeros in denominator)

Handling NULL with COALESCE and NULLIF

COALESCE returns the first non-NULL value from a list of arguments. It’s essential for providing default values and preventing NULL propagation in calculations.

-- Using COALESCE for default values
SELECT 
    product,
    amount,
    commission,
    COALESCE(commission, 0) as commission_with_default,
    amount + COALESCE(commission, 0) as total
FROM sales;

-- Multiple fallback values
SELECT 
    name,
    COALESCE(work_phone, mobile_phone, home_phone, 'No phone') as contact_number
FROM contacts;

NULLIF returns NULL if two expressions are equal, otherwise returns the first expression. This is useful for converting specific values to NULL.

-- Convert empty strings to NULL
SELECT 
    NULLIF(TRIM(email), '') as cleaned_email
FROM users;

-- Avoid division by zero
SELECT 
    total_sales,
    total_orders,
    total_sales / NULLIF(total_orders, 0) as avg_order_value
FROM monthly_stats;

NULL in JOINs

NULL values in join columns can cause rows to be excluded unexpectedly. Understanding this behavior is crucial for accurate data retrieval.

CREATE TABLE departments (
    dept_id INT PRIMARY KEY,
    dept_name VARCHAR(100)
);

CREATE TABLE staff (
    staff_id INT PRIMARY KEY,
    staff_name VARCHAR(100),
    dept_id INT
);

INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
INSERT INTO staff VALUES 
(1, 'Alice', 1),
(2, 'Bob', 2),
(3, 'Carol', NULL),
(4, 'Dave', 3);

-- INNER JOIN excludes NULLs and non-matching values
SELECT s.staff_name, d.dept_name
FROM staff s
INNER JOIN departments d ON s.dept_id = d.dept_id;
-- Returns: Alice (Engineering), Bob (Sales)
-- Excludes: Carol (NULL dept_id), Dave (dept_id=3 doesn't exist)

-- LEFT JOIN preserves NULL values from left table
SELECT s.staff_name, d.dept_name
FROM staff s
LEFT JOIN departments d ON s.dept_id = d.dept_id;
-- Returns all staff, with NULL dept_name for Carol and Dave

-- Find staff without departments
SELECT staff_name
FROM staff
WHERE dept_id IS NULL;
-- Returns: Carol

NULL and Sorting

NULL values have specific sorting behavior that varies by database system, but most treat NULL as either the lowest or highest value.

-- Default NULL sorting (PostgreSQL, MySQL: NULL first in ASC)
SELECT name, salary
FROM employees
ORDER BY salary;

-- Explicit NULL positioning (PostgreSQL, Oracle)
SELECT name, salary
FROM employees
ORDER BY salary NULLS LAST;

SELECT name, salary
FROM employees
ORDER BY salary DESC NULLS FIRST;

-- Portable solution using COALESCE
SELECT name, salary
FROM employees
ORDER BY COALESCE(salary, -1); -- Treats NULL as -1 for sorting

NULL and Indexes

Most database systems don’t index NULL values in single-column indexes, which affects query performance when searching for NULL values.

-- This index won't help with IS NULL queries in many databases
CREATE INDEX idx_manager ON employees(manager_id);

-- Partial index for NULL values (PostgreSQL)
CREATE INDEX idx_no_manager ON employees(id) WHERE manager_id IS NULL;

-- Composite index includes NULLs
CREATE INDEX idx_dept_salary ON employees(department_id, salary);
-- This index CAN be used for IS NULL queries on department_id

Practical NULL Handling Patterns

Here are production-ready patterns for dealing with NULL values effectively:

-- Pattern 1: Safe NULL comparisons in WHERE clauses
SELECT * FROM products
WHERE (discontinued_date IS NULL OR discontinued_date > CURRENT_DATE);

-- Pattern 2: NULL-safe equality comparison
SELECT * FROM users u1
JOIN users u2 ON (
    u1.middle_name = u2.middle_name 
    OR (u1.middle_name IS NULL AND u2.middle_name IS NULL)
);

-- Pattern 3: Conditional aggregates with NULL handling
SELECT 
    department,
    COUNT(*) as total_employees,
    COUNT(bonus) as employees_with_bonus,
    SUM(COALESCE(bonus, 0)) as total_bonus_paid,
    AVG(COALESCE(bonus, 0)) as avg_bonus_including_zeros
FROM employees
GROUP BY department;

-- Pattern 4: Data quality checks
SELECT 
    'Missing email' as issue,
    COUNT(*) as count
FROM users
WHERE email IS NULL
UNION ALL
SELECT 
    'Missing phone' as issue,
    COUNT(*) as count
FROM users
WHERE phone IS NULL;

Understanding NULL behavior isn’t optional—it’s fundamental to writing correct SQL. Every WHERE clause, JOIN condition, and aggregate function must account for NULL values. Use IS NULL and IS NOT NULL explicitly, leverage COALESCE for defaults, and test your queries with NULL data to avoid production surprises.