SQL - FULL OUTER JOIN | Application Architect

Key Insights

FULL OUTER JOIN returns all records from both tables, filling in NULLs where no match exists—making it the ideal tool for data reconciliation and finding mismatches between datasets.
MySQL doesn’t support FULL OUTER JOIN natively, but you can replicate it using a UNION of LEFT and RIGHT JOINs.
Use COALESCE() to handle NULL values gracefully and add WHERE clauses to isolate unmatched records when debugging data discrepancies.

Introduction to FULL OUTER JOIN

A FULL OUTER JOIN combines the behavior of both LEFT and RIGHT joins into a single operation. It returns every row from both tables in the join, matching rows where possible and filling in NULL values where no match exists on either side.

Think of it as asking the database: “Give me everything from both tables, and show me where they connect—or don’t.”

Here’s the basic syntax:

SELECT columns
FROM table_a
FULL OUTER JOIN table_b
    ON table_a.key = table_b.key;

This query returns all rows from table_a and all rows from table_b. When a row in table_a has a matching row in table_b, you get the combined data. When there’s no match, you get NULLs for the columns from the table that lacks a corresponding row.

How FULL OUTER JOIN Differs from Other Joins

Understanding FULL OUTER JOIN requires comparing it to other join types. Let’s use two simple tables to illustrate the differences.

-- Sample data setup
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(50),
    department_id INT
);

CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(50)
);

INSERT INTO employees VALUES 
    (1, 'Alice', 10),
    (2, 'Bob', 20),
    (3, 'Charlie', NULL);

INSERT INTO departments VALUES 
    (10, 'Engineering'),
    (20, 'Marketing'),
    (30, 'Finance');

Now let’s see how each join type handles this data:

-- INNER JOIN: Only matching rows (2 rows)
SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id;
-- Result: Alice/Engineering, Bob/Marketing

-- LEFT JOIN: All employees, matching departments (3 rows)
SELECT e.name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id;
-- Result: Alice/Engineering, Bob/Marketing, Charlie/NULL

-- RIGHT JOIN: All departments, matching employees (3 rows)
SELECT e.name, d.department_name
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id;
-- Result: Alice/Engineering, Bob/Marketing, NULL/Finance

-- FULL OUTER JOIN: Everything from both tables (4 rows)
SELECT e.name, d.department_name
FROM employees e
FULL OUTER JOIN departments d ON e.department_id = d.department_id;
-- Result: Alice/Engineering, Bob/Marketing, Charlie/NULL, NULL/Finance

The FULL OUTER JOIN captures both Charlie (who has no department) and Finance (which has no employees). No other single join type gives you this complete picture.

Basic FULL OUTER JOIN Syntax and Usage

Let’s break down the construction of a FULL OUTER JOIN query step by step.

SELECT 
    e.employee_id,
    e.name,
    e.department_id AS emp_dept_id,
    d.department_id AS dept_id,
    d.department_name
FROM employees e
FULL OUTER JOIN departments d 
    ON e.department_id = d.department_id
ORDER BY e.employee_id, d.department_id;

The query structure follows this pattern:

SELECT clause: Specify columns from both tables. I recommend aliasing ambiguous columns (like department_id which exists in both tables) to make the output clear.
FROM clause: Start with your first table and give it an alias for readability.
FULL OUTER JOIN clause: Specify the second table with its alias.
ON clause: Define the join condition. This is typically a primary key to foreign key relationship.
ORDER BY clause: Optional but helpful for consistent, readable output.

You can chain multiple FULL OUTER JOINs together:

SELECT 
    e.name,
    d.department_name,
    l.city
FROM employees e
FULL OUTER JOIN departments d ON e.department_id = d.department_id
FULL OUTER JOIN locations l ON d.location_id = l.location_id;

Be cautious with multiple FULL OUTER JOINs—the result set can grow large quickly, and the logic becomes harder to reason about.

Practical Use Cases

FULL OUTER JOIN shines in specific scenarios. Here are the situations where you should reach for it.

Data Reconciliation Between Systems

When comparing data from two sources that should match but might have discrepancies:

-- Compare inventory counts between warehouse system and ERP
SELECT 
    COALESCE(w.product_id, e.product_id) AS product_id,
    w.quantity AS warehouse_qty,
    e.quantity AS erp_qty,
    CASE 
        WHEN w.product_id IS NULL THEN 'Missing from warehouse'
        WHEN e.product_id IS NULL THEN 'Missing from ERP'
        WHEN w.quantity != e.quantity THEN 'Quantity mismatch'
        ELSE 'OK'
    END AS status
FROM warehouse_inventory w
FULL OUTER JOIN erp_inventory e 
    ON w.product_id = e.product_id
WHERE w.product_id IS NULL 
   OR e.product_id IS NULL 
   OR w.quantity != e.quantity;

This query immediately surfaces three types of problems: items in the warehouse but not the ERP, items in the ERP but not the warehouse, and quantity mismatches.

Finding Orphaned Records

Identify records that exist in one table but lack corresponding entries in related tables:

-- Find orders without customers AND customers without orders
SELECT 
    c.customer_id,
    c.customer_name,
    o.order_id,
    o.order_date
FROM customers c
FULL OUTER JOIN orders o ON c.customer_id = o.customer_id
WHERE c.customer_id IS NULL OR o.order_id IS NULL;

Merging Incomplete Data Sources

When consolidating data from multiple sources where neither is complete:

-- Merge customer data from two acquired companies
SELECT 
    COALESCE(a.email, b.email) AS email,
    COALESCE(a.name, b.name) AS name,
    a.loyalty_points AS company_a_points,
    b.loyalty_points AS company_b_points,
    COALESCE(a.loyalty_points, 0) + COALESCE(b.loyalty_points, 0) AS total_points
FROM company_a_customers a
FULL OUTER JOIN company_b_customers b ON a.email = b.email;

Handling NULL Values in Results

FULL OUTER JOIN produces NULLs by design. Here’s how to manage them effectively.

Using COALESCE for Clean Output

COALESCE returns the first non-NULL value from its arguments:

SELECT 
    COALESCE(e.employee_id, -1) AS employee_id,
    COALESCE(e.name, 'Unassigned') AS employee_name,
    COALESCE(d.department_name, 'No Department') AS department_name
FROM employees e
FULL OUTER JOIN departments d ON e.department_id = d.department_id;

Filtering for Specific Match States

Use WHERE clauses with NULL checks to isolate different scenarios:

-- Only unmatched records from the left table
SELECT * FROM table_a a
FULL OUTER JOIN table_b b ON a.key = b.key
WHERE b.key IS NULL;

-- Only unmatched records from the right table
SELECT * FROM table_a a
FULL OUTER JOIN table_b b ON a.key = b.key
WHERE a.key IS NULL;

-- Only unmatched records from either table
SELECT * FROM table_a a
FULL OUTER JOIN table_b b ON a.key = b.key
WHERE a.key IS NULL OR b.key IS NULL;

-- Only matched records (equivalent to INNER JOIN)
SELECT * FROM table_a a
FULL OUTER JOIN table_b b ON a.key = b.key
WHERE a.key IS NOT NULL AND b.key IS NOT NULL;

Database Compatibility and Workarounds

Not all databases support FULL OUTER JOIN. Here’s the compatibility landscape:

Full support: PostgreSQL, SQL Server, Oracle, SQLite (3.39+)
No native support: MySQL, MariaDB

For MySQL, simulate FULL OUTER JOIN using a UNION of LEFT and RIGHT joins:

-- MySQL workaround for FULL OUTER JOIN
SELECT 
    e.employee_id,
    e.name,
    d.department_id,
    d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id

UNION

SELECT 
    e.employee_id,
    e.name,
    d.department_id,
    d.department_name
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id;

The UNION automatically removes duplicates (the matched rows that appear in both queries). If you need to preserve duplicates for some reason, use UNION ALL with an additional WHERE clause:

SELECT e.*, d.*
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id

UNION ALL

SELECT e.*, d.*
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id
WHERE e.employee_id IS NULL;

Performance Considerations

FULL OUTER JOIN can be expensive. Here’s how to keep it performant.

Index your join columns. This is non-negotiable. Without indexes on the columns in your ON clause, the database must perform full table scans on both tables.

CREATE INDEX idx_employees_dept ON employees(department_id);
CREATE INDEX idx_departments_id ON departments(department_id);

Filter early when possible. If you only need a subset of data, apply WHERE conditions that can be pushed down:

-- Less efficient: filter after join
SELECT * FROM large_table_a a
FULL OUTER JOIN large_table_b b ON a.key = b.key
WHERE a.created_date > '2024-01-01';

-- More efficient: filter in subquery
SELECT * FROM (
    SELECT * FROM large_table_a WHERE created_date > '2024-01-01'
) a
FULL OUTER JOIN large_table_b b ON a.key = b.key;

Consider alternatives for large datasets. If you’re working with millions of rows and only need to find mismatches, consider using EXISTS/NOT EXISTS subqueries or breaking the operation into separate LEFT JOIN queries that you analyze independently.

Check your execution plan. Use EXPLAIN (PostgreSQL) or EXPLAIN ANALYZE to understand how the database processes your FULL OUTER JOIN. Look for sequential scans on large tables and missing index usage.

FULL OUTER JOIN is a powerful tool for data analysis and reconciliation. Use it when you genuinely need the complete picture from both tables—but understand its cost and apply it judiciously.