SQL - SELECT Statement with Examples
The SELECT statement retrieves data from database tables. At its core, it specifies which columns to return and from which table.
Key Insights
- The SELECT statement is SQL’s primary data retrieval mechanism, supporting everything from simple column selection to complex joins, aggregations, and subqueries across multiple tables
- Performance optimization requires understanding execution order (FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY) and proper indexing strategies for filtering and sorting operations
- Modern SQL extends basic SELECT with window functions, CTEs, and set operations to handle analytical queries that would otherwise require application-level processing
Basic SELECT Syntax
The SELECT statement retrieves data from database tables. At its core, it specifies which columns to return and from which table.
-- Select specific columns
SELECT first_name, last_name, email
FROM users;
-- Select all columns
SELECT *
FROM users;
-- Select with column aliases
SELECT
first_name AS "First Name",
last_name AS "Last Name",
email AS "Email Address"
FROM users;
The asterisk (*) selects all columns but should be avoided in production code. Explicitly naming columns improves performance, reduces network overhead, and makes queries maintainable when table schemas change.
Filtering with WHERE
The WHERE clause filters rows based on specified conditions. It executes before SELECT, affecting which rows the database processes.
-- Single condition
SELECT product_name, price
FROM products
WHERE price > 100;
-- Multiple conditions with AND/OR
SELECT product_name, price, category
FROM products
WHERE price BETWEEN 50 AND 200
AND category IN ('Electronics', 'Computers')
AND stock_quantity > 0;
-- Pattern matching with LIKE
SELECT customer_name, email
FROM customers
WHERE email LIKE '%@gmail.com'
OR customer_name LIKE 'John%';
-- NULL handling
SELECT order_id, shipped_date
FROM orders
WHERE shipped_date IS NULL;
Key operators include comparison (=, !=, <, >, <=, >=), logical (AND, OR, NOT), range (BETWEEN), membership (IN), and pattern matching (LIKE). Always use IS NULL or IS NOT NULL for NULL comparisons—equality operators don’t work with NULL values.
Sorting Results with ORDER BY
ORDER BY controls result set ordering. It executes last in the logical query processing order, after all filtering and aggregation.
-- Single column sort
SELECT product_name, price
FROM products
ORDER BY price DESC;
-- Multiple column sort
SELECT category, product_name, price
FROM products
ORDER BY category ASC, price DESC;
-- Sort by column position (avoid in production)
SELECT product_name, price, stock_quantity
FROM products
ORDER BY 2 DESC, 3 ASC;
-- Sort with expressions
SELECT first_name, last_name, salary
FROM employees
ORDER BY (salary * 1.1) DESC;
Ascending (ASC) is default. Sorting by column position is fragile—query modifications break it. For performance, ensure ORDER BY columns are indexed, especially for large result sets.
Limiting Results
Different databases use different syntax for limiting rows returned.
-- PostgreSQL, MySQL (modern versions)
SELECT product_name, price
FROM products
ORDER BY price DESC
LIMIT 10 OFFSET 20;
-- SQL Server
SELECT TOP 10 product_name, price
FROM products
ORDER BY price DESC;
-- SQL Server with OFFSET (SQL Server 2012+)
SELECT product_name, price
FROM products
ORDER BY price DESC
OFFSET 20 ROWS
FETCH NEXT 10 ROWS ONLY;
-- Oracle
SELECT product_name, price
FROM products
ORDER BY price DESC
FETCH FIRST 10 ROWS ONLY;
LIMIT with OFFSET enables pagination but becomes inefficient for large offsets. Consider keyset pagination for better performance on large datasets.
Aggregation Functions
Aggregate functions compute single values from multiple rows. They typically combine with GROUP BY for grouped calculations.
-- Basic aggregations
SELECT
COUNT(*) AS total_orders,
COUNT(DISTINCT customer_id) AS unique_customers,
SUM(order_total) AS revenue,
AVG(order_total) AS average_order_value,
MIN(order_date) AS first_order,
MAX(order_date) AS last_order
FROM orders;
-- GROUP BY for grouped aggregations
SELECT
category,
COUNT(*) AS product_count,
AVG(price) AS avg_price,
SUM(stock_quantity) AS total_stock
FROM products
GROUP BY category
ORDER BY avg_price DESC;
-- HAVING filters aggregated results
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(order_total) AS total_spent
FROM orders
GROUP BY customer_id
HAVING COUNT(*) >= 5
AND SUM(order_total) > 1000
ORDER BY total_spent DESC;
WHERE filters before aggregation; HAVING filters after. Use WHERE when possible—it’s more efficient since it reduces rows before aggregation processing.
Joins
Joins combine rows from multiple tables based on related columns.
-- INNER JOIN - returns only matching rows
SELECT
o.order_id,
c.customer_name,
o.order_date,
o.order_total
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2024-01-01';
-- LEFT JOIN - returns all left table rows
SELECT
c.customer_name,
COUNT(o.order_id) AS order_count,
COALESCE(SUM(o.order_total), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name;
-- Multiple joins
SELECT
o.order_id,
c.customer_name,
p.product_name,
oi.quantity,
oi.unit_price
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '30 days';
Table aliases (o, c, p) improve readability. Always index foreign key columns used in JOIN conditions. Consider join order—most databases optimize automatically, but understanding cardinality helps.
Subqueries
Subqueries nest queries within queries, useful for complex filtering and calculations.
-- Subquery in WHERE clause
SELECT product_name, price
FROM products
WHERE price > (
SELECT AVG(price)
FROM products
);
-- Subquery with IN
SELECT customer_name, email
FROM customers
WHERE customer_id IN (
SELECT DISTINCT customer_id
FROM orders
WHERE order_date >= '2024-01-01'
);
-- Correlated subquery
SELECT
p.product_name,
p.price,
(SELECT AVG(price)
FROM products p2
WHERE p2.category = p.category) AS category_avg_price
FROM products p
WHERE p.price > (
SELECT AVG(price)
FROM products p3
WHERE p3.category = p.category
);
Correlated subqueries reference outer query columns and execute once per outer row—potentially expensive. Often rewritable as joins for better performance.
Common Table Expressions (CTEs)
CTEs create temporary named result sets, improving query readability and enabling recursive queries.
-- Basic CTE
WITH high_value_customers AS (
SELECT
customer_id,
SUM(order_total) AS total_spent
FROM orders
GROUP BY customer_id
HAVING SUM(order_total) > 5000
)
SELECT
c.customer_name,
c.email,
hvc.total_spent
FROM high_value_customers hvc
INNER JOIN customers c ON hvc.customer_id = c.customer_id
ORDER BY hvc.total_spent DESC;
-- Multiple CTEs
WITH monthly_sales AS (
SELECT
DATE_TRUNC('month', order_date) AS month,
SUM(order_total) AS revenue
FROM orders
GROUP BY DATE_TRUNC('month', order_date)
),
sales_with_growth AS (
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) AS prev_month_revenue
FROM monthly_sales
)
SELECT
month,
revenue,
prev_month_revenue,
ROUND(((revenue - prev_month_revenue) / prev_month_revenue * 100), 2) AS growth_pct
FROM sales_with_growth
WHERE prev_month_revenue IS NOT NULL;
CTEs execute once and materialize results (in most databases), unlike subqueries which may execute multiple times. Use them for complex queries requiring multiple references to the same derived data.
Window Functions
Window functions perform calculations across row sets without collapsing results like GROUP BY does.
-- Ranking functions
SELECT
product_name,
category,
price,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank,
RANK() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank_with_ties,
DENSE_RANK() OVER (PARTITION BY category ORDER BY price DESC) AS dense_rank
FROM products;
-- Aggregate window functions
SELECT
order_date,
order_total,
SUM(order_total) OVER (
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total,
AVG(order_total) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_7_days
FROM orders
ORDER BY order_date;
Window functions enable sophisticated analytical queries: running totals, moving averages, ranking, and lead/lag analysis without self-joins or correlated subqueries.
Set Operations
Set operations combine results from multiple SELECT statements.
-- UNION removes duplicates
SELECT customer_id, 'Premium' AS segment
FROM premium_customers
UNION
SELECT customer_id, 'Standard' AS segment
FROM standard_customers;
-- UNION ALL keeps duplicates (faster)
SELECT product_id, product_name FROM active_products
UNION ALL
SELECT product_id, product_name FROM archived_products;
-- INTERSECT - common rows
SELECT customer_id FROM orders_2023
INTERSECT
SELECT customer_id FROM orders_2024;
-- EXCEPT - rows in first but not second
SELECT customer_id FROM all_customers
EXCEPT
SELECT customer_id FROM churned_customers;
All SELECT statements must have the same number of columns with compatible data types. UNION ALL performs better than UNION—use it when duplicates don’t matter or can’t exist.