How to Use LEFT JOIN in PostgreSQL

LEFT JOIN (also called LEFT OUTER JOIN) is PostgreSQL's tool for preserving all rows from your primary table while optionally attaching related data from secondary tables. Unlike INNER JOIN, which...

Key Insights

  • LEFT JOIN returns all rows from the left table and matching rows from the right table, filling in NULLs where no match exists—essential for finding missing relationships and generating complete reports
  • The placement of filter conditions matters: WHERE clause filters happen after the join (removing rows), while ON clause conditions affect which rows match during the join itself
  • Always index your join columns and be explicit about NULL handling with COALESCE or IS NULL checks to avoid performance issues and logical errors in production queries

Understanding LEFT JOIN Fundamentals

LEFT JOIN (also called LEFT OUTER JOIN) is PostgreSQL’s tool for preserving all rows from your primary table while optionally attaching related data from secondary tables. Unlike INNER JOIN, which only returns rows where both tables have matching records, LEFT JOIN guarantees every row from the left table appears in your results.

This distinction is critical. When you need a complete customer list regardless of whether they’ve placed orders, or want to show all products even if they haven’t been sold, LEFT JOIN is your solution. INNER JOIN would silently exclude customers without orders—potentially hiding important business insights about inactive accounts.

Here’s what happens at the row level:

-- Sample data visualization
-- customers table:        orders table:
-- id | name               id | customer_id | amount
-- 1  | Alice              1  | 1           | 100
-- 2  | Bob                2  | 1           | 200
-- 3  | Carol              3  | 3           | 150

-- LEFT JOIN result includes all customers:
-- customer_id | name  | order_id | amount
-- 1           | Alice | 1        | 100
-- 1           | Alice | 2        | 200
-- 2           | Bob   | NULL     | NULL    <- Bob preserved with NULLs
-- 3           | Carol | 3        | 150

Notice Bob appears in the results even though he has no orders. The order columns contain NULL for his row. This is LEFT JOIN’s defining characteristic.

Basic LEFT JOIN Syntax

The syntax follows a straightforward pattern: specify your primary table, declare LEFT JOIN, name your secondary table, and define the relationship in the ON clause.

SELECT 
    c.id,
    c.name,
    c.email,
    o.id AS order_id,
    o.amount,
    o.created_at
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
ORDER BY c.id;

This query retrieves every customer. When a customer has orders, you’ll see one row per order with that order’s details. When a customer has no orders, you’ll see one row with NULL values in the order columns.

The result set might look like:

 id |  name  |        email        | order_id | amount |     created_at      
----+--------+---------------------+----------+--------+---------------------
  1 | Alice  | alice@example.com   |      101 |  49.99 | 2024-01-15 10:30:00
  1 | Alice  | alice@example.com   |      102 | 129.99 | 2024-02-01 14:22:00
  2 | Bob    | bob@example.com     |     NULL |   NULL | NULL
  3 | Carol  | carol@example.com   |      103 |  79.99 | 2024-01-20 09:15:00

Alice appears twice because she has two orders. Bob appears once with NULLs. This is exactly what you want when generating customer reports that must include everyone.

Filtering with WHERE vs ON Clauses

Here’s where developers frequently introduce bugs. Filter placement dramatically changes query behavior with LEFT JOIN.

Conditions in the ON clause affect the join matching process:

-- Filters DURING the join - still returns all customers
SELECT c.name, o.amount
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id 
    AND o.amount > 100
ORDER BY c.name;

This returns all customers. For customers whose orders don’t meet the amount condition, the order columns will be NULL. You’re filtering which orders to attach, not which customers to include.

Conditions in the WHERE clause filter the final result set:

-- Filters AFTER the join - excludes customers without large orders
SELECT c.name, o.amount
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
WHERE o.amount > 100
ORDER BY c.name;

This returns only customers who have orders exceeding 100. Customers without orders disappear because NULL > 100 evaluates to false. You’ve effectively converted this to an INNER JOIN.

To find unmatched records—customers without any orders—use WHERE with an IS NULL check:

SELECT 
    c.id,
    c.name,
    c.email,
    c.created_at
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
WHERE o.id IS NULL
ORDER BY c.created_at DESC;

This pattern is extremely useful for data quality checks and identifying inactive accounts. You’re leveraging the NULL values that LEFT JOIN produces to isolate rows without relationships.

Chaining Multiple LEFT JOINs

Complex queries often require joining multiple tables. Stack LEFT JOINs sequentially, understanding that each join operates on the result of the previous operations.

SELECT 
    c.name AS customer_name,
    o.id AS order_id,
    o.created_at AS order_date,
    oi.product_name,
    oi.quantity,
    oi.price
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
LEFT JOIN order_items oi ON o.id = oi.order_id
ORDER BY c.name, o.created_at, oi.id;

This query shows all customers, their orders (if any), and the items in those orders (if any). A customer without orders appears once with NULLs. A customer with an order but no items appears once with order details but NULL item details.

The order matters. If you reversed the joins (orders LEFT JOIN customers), you’d get all orders but might miss customers without orders—defeating the purpose if your goal is a complete customer list.

Aggregating with LEFT JOIN

LEFT JOIN shines in reporting queries where you need counts or sums that include zero values. Standard aggregation with INNER JOIN would exclude entities with no related records.

SELECT 
    c.id,
    c.name,
    COUNT(o.id) AS order_count,
    COALESCE(SUM(o.amount), 0) AS total_spent,
    MAX(o.created_at) AS last_order_date
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name
ORDER BY total_spent DESC;

Key techniques here:

  • Use COUNT(o.id) not COUNT(*). Counting the right table’s primary key gives you zero for unmatched rows. COUNT(*) counts the row itself, giving you one.
  • Wrap SUM() in COALESCE() to convert NULL (no orders) to 0 for cleaner reporting.
  • MAX(o.created_at) naturally returns NULL for customers without orders, which is semantically correct.

This query produces a complete customer list with accurate statistics, including customers who’ve never ordered.

Performance and Common Pitfalls

LEFT JOIN performance depends heavily on indexes. Always index the columns in your ON clause:

CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);

Without these indexes, PostgreSQL performs sequential scans—acceptable for small tables, catastrophic for production data.

Use EXPLAIN ANALYZE to verify your join strategy:

EXPLAIN ANALYZE
SELECT c.name, COUNT(o.id)
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;

Look for “Hash Join” or “Nested Loop” with index scans. If you see “Seq Scan” on large tables, you need indexes.

Common NULL handling mistakes include:

-- WRONG: This excludes customers without orders
WHERE o.amount > 0

-- RIGHT: Explicitly handle NULLs
WHERE o.amount > 0 OR o.amount IS NULL

-- BETTER: Filter in ON clause if you want all customers
LEFT JOIN orders o ON c.id = o.customer_id AND o.amount > 0

Choose LEFT JOIN when you need all rows from your primary entity. If you only care about matching records, INNER JOIN is faster and clearer. If you need all rows from both tables, use FULL OUTER JOIN (rare in practice).

Real-World Example: Customer Activity Report

Here’s a complete reporting query combining these techniques:

WITH customer_stats AS (
    SELECT 
        c.id,
        c.name,
        c.email,
        c.created_at AS customer_since,
        COUNT(o.id) AS total_orders,
        COALESCE(SUM(o.amount), 0) AS lifetime_value,
        MAX(o.created_at) AS last_order_date,
        CASE 
            WHEN MAX(o.created_at) IS NULL THEN 'Never Ordered'
            WHEN MAX(o.created_at) < NOW() - INTERVAL '90 days' THEN 'Inactive'
            ELSE 'Active'
        END AS status
    FROM customers c
    LEFT JOIN orders o ON c.id = o.customer_id
    WHERE c.created_at >= '2024-01-01'
    GROUP BY c.id, c.name, c.email, c.created_at
)
SELECT 
    status,
    COUNT(*) AS customer_count,
    ROUND(AVG(total_orders), 2) AS avg_orders_per_customer,
    SUM(lifetime_value) AS total_revenue
FROM customer_stats
GROUP BY status
ORDER BY total_revenue DESC;

This query answers business questions like “How many customers have never ordered?” and “What’s our revenue from inactive customers?” Without LEFT JOIN, you’d miss the “Never Ordered” segment entirely—a critical blind spot for marketing teams.

The pattern is: preserve all primary entities with LEFT JOIN, aggregate to get per-entity statistics, then aggregate again to get segment-level insights. This two-level aggregation approach is common in analytics and reporting systems.

Master LEFT JOIN and you’ll write more complete, accurate queries that surface the full picture of your data—including the gaps.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.