How to Use JOIN in MySQL
Relational databases store data across multiple tables to reduce redundancy and maintain data integrity. JOINs let you recombine that data when you need it. Without JOINs, you'd be stuck making...
Key Insights
- JOINs are the foundation of relational database queries—master INNER, LEFT, and RIGHT JOINs to handle 95% of real-world scenarios
- MySQL lacks native FULL OUTER JOIN support, but a UNION of LEFT and RIGHT JOINs achieves the same result
- Always index your foreign key columns and use EXPLAIN to catch performance issues before they hit production
Introduction to JOINs
Relational databases store data across multiple tables to reduce redundancy and maintain data integrity. JOINs let you recombine that data when you need it. Without JOINs, you’d be stuck making multiple queries and stitching results together in application code—a slow, error-prone approach.
A JOIN combines rows from two or more tables based on a related column between them. Think of it as matching rows where a condition is true. The most common condition is matching a foreign key in one table to a primary key in another.
Let’s work with a simple e-commerce schema throughout this article:
CREATE TABLE customers (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE
);
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
customer_id INT,
order_date DATE,
total DECIMAL(10, 2),
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
CREATE TABLE products (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
price DECIMAL(10, 2)
);
CREATE TABLE order_items (
id INT PRIMARY KEY AUTO_INCREMENT,
order_id INT,
product_id INT,
quantity INT,
FOREIGN KEY (order_id) REFERENCES orders(id),
FOREIGN KEY (product_id) REFERENCES products(id)
);
INNER JOIN
INNER JOIN returns only rows where the join condition matches in both tables. If a row in the left table has no corresponding row in the right table, it’s excluded from the results. This is the most common JOIN type and what most people mean when they say “JOIN” without qualification.
SELECT
customers.name,
customers.email,
orders.id AS order_id,
orders.order_date,
orders.total
FROM customers
INNER JOIN orders ON customers.id = orders.customer_id;
This query returns only customers who have placed at least one order. Customers with no orders don’t appear in the results at all. Each customer appears once per order they’ve placed.
You can use table aliases to make queries more readable:
SELECT
c.name,
c.email,
o.id AS order_id,
o.order_date,
o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.total > 100.00
ORDER BY o.order_date DESC;
The INNER keyword is optional—JOIN alone defaults to INNER JOIN. I recommend being explicit for clarity, especially in codebases where multiple developers work together.
LEFT JOIN (LEFT OUTER JOIN)
LEFT JOIN returns all rows from the left table (the one after FROM), plus matched rows from the right table. When there’s no match, the right table’s columns contain NULL.
This is essential when you want to include records that might not have related data:
SELECT
c.name,
c.email,
o.id AS order_id,
o.order_date,
o.total
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id;
Now every customer appears in the results. Customers without orders show NULL for all order columns. This is perfect for reports that need to show “customers who haven’t ordered” or calculating metrics across your entire customer base.
To find customers who have never placed an order:
SELECT
c.name,
c.email
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
WHERE o.id IS NULL;
This pattern—LEFT JOIN followed by checking for NULL in the right table’s primary key—is a common way to find “orphaned” or “missing” relationships.
RIGHT JOIN (RIGHT OUTER JOIN)
RIGHT JOIN is the inverse of LEFT JOIN. It returns all rows from the right table plus matched rows from the left. In practice, you can always rewrite a RIGHT JOIN as a LEFT JOIN by swapping the table order, so many developers avoid RIGHT JOIN entirely for consistency.
SELECT
c.name,
c.email,
o.id AS order_id,
o.order_date,
o.total
FROM customers c
RIGHT JOIN orders o ON c.id = o.customer_id;
This returns all orders, even if the customer record is missing (which could happen due to data integrity issues or if you allow anonymous orders). Orders without a matching customer show NULL for customer columns.
The equivalent LEFT JOIN:
SELECT
c.name,
c.email,
o.id AS order_id,
o.order_date,
o.total
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id;
I recommend picking one style (usually LEFT JOIN) and sticking with it. Mixing LEFT and RIGHT JOINs in the same codebase creates unnecessary cognitive load.
FULL OUTER JOIN (Workaround)
FULL OUTER JOIN returns all rows from both tables, with NULLs where there’s no match on either side. MySQL doesn’t support this natively, but you can simulate it by combining LEFT and RIGHT JOINs with UNION:
SELECT
c.name,
c.email,
o.id AS order_id,
o.order_date,
o.total
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
UNION
SELECT
c.name,
c.email,
o.id AS order_id,
o.order_date,
o.total
FROM customers c
RIGHT JOIN orders o ON c.id = o.customer_id
WHERE c.id IS NULL;
The WHERE clause in the second query prevents duplicate rows. Without it, matching rows would appear twice (once from each JOIN).
FULL OUTER JOIN is rarely needed in practice. If you find yourself reaching for it frequently, reconsider your schema design—it often indicates a missing relationship table or unclear data ownership.
Self JOIN and Multiple Table JOINs
A self JOIN joins a table to itself. This is useful for hierarchical data like organizational structures or category trees.
Consider an employees table with a manager relationship:
CREATE TABLE employees (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
manager_id INT,
FOREIGN KEY (manager_id) REFERENCES employees(id)
);
To list employees with their manager’s name:
SELECT
e.name AS employee_name,
m.name AS manager_name
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
Table aliases are mandatory here since you’re referencing the same table twice. The LEFT JOIN ensures employees without managers (like the CEO) still appear in results.
For multiple table JOINs, chain them together. Here’s a query that shows customer orders with product details:
SELECT
c.name AS customer_name,
o.order_date,
p.name AS product_name,
oi.quantity,
p.price,
(oi.quantity * p.price) AS line_total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
INNER JOIN order_items oi ON o.id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id
WHERE o.order_date >= '2024-01-01'
ORDER BY o.order_date DESC, c.name;
Each JOIN builds on the previous one. The query starts with customers, joins to orders, then to order_items, and finally to products. The order matters for readability but not for performance—MySQL’s query optimizer rearranges JOINs for efficiency.
Performance Tips and Best Practices
JOINs can be expensive operations. Here’s how to keep them fast.
Index your foreign keys. This is the single most important optimization. Without indexes, MySQL performs full table scans for each JOIN:
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE INDEX idx_order_items_product_id ON order_items(product_id);
Select only the columns you need. Avoid SELECT * in production code. Fetching unnecessary columns wastes memory and network bandwidth:
-- Bad: fetches all columns
SELECT * FROM customers c
INNER JOIN orders o ON c.id = o.customer_id;
-- Good: fetches only what's needed
SELECT c.name, o.order_date, o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id;
Use EXPLAIN to analyze queries. EXPLAIN shows you how MySQL plans to execute a query:
EXPLAIN SELECT
c.name,
o.order_date,
o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.total > 100.00;
Look for these warning signs in EXPLAIN output:
type: ALLindicates a full table scanrowsshowing large numbers means many rows are being examinedExtra: Using filesortorUsing temporarysuggests expensive operations
Filter early. Put WHERE conditions on indexed columns to reduce the number of rows being joined:
-- Better: filter orders first, then join
SELECT c.name, o.order_date, o.total
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2024-01-01';
Consider denormalization for read-heavy workloads. If you’re constantly joining the same tables for reports, adding redundant columns or creating summary tables might be worth the trade-off.
JOINs are fundamental to working with relational data. Master these patterns, index your foreign keys, and use EXPLAIN when queries slow down. That’s 90% of what you need to write efficient MySQL queries.