SQL Interview Questions and Answers (Top 50)
SQL remains the lingua franca of data. Whether you're interviewing for a backend role, data engineering position, or even some frontend jobs that touch databases, you'll face SQL questions. This...
Key Insights
- SQL interviews test pattern recognition more than memorization—learn the underlying concepts behind common query patterns like self-joins, window functions, and CTEs
- Most interview questions fall into predictable categories: ranking, finding duplicates, handling NULLs, and comparing rows—master these patterns and you’ll handle 80% of questions
- Performance questions separate senior candidates from juniors—understand indexing basics and how to read execution plans even if you’re not a DBA
SQL remains the lingua franca of data. Whether you’re interviewing for a backend role, data engineering position, or even some frontend jobs that touch databases, you’ll face SQL questions. This guide covers the 50 most common questions organized by difficulty, with working code examples you can practice.
How to Use This Guide
Questions progress from basic SELECT statements to advanced window functions and optimization. If you’re short on time, focus on joins (questions 16-25) and aggregations (26-32)—these appear in nearly every interview. For senior roles, window functions and CTEs are non-negotiable.
Basic SQL Questions (1-15)
Q1: What’s the difference between WHERE and HAVING?
WHERE filters rows before grouping; HAVING filters after aggregation.
-- WHERE filters individual rows
SELECT department, salary
FROM employees
WHERE salary > 50000;
-- HAVING filters grouped results
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000;
Q2: How do you handle NULL values?
NULL represents unknown data. You can’t use = to compare NULLs—use IS NULL or IS NOT NULL.
-- Wrong: This returns nothing even if NULLs exist
SELECT * FROM employees WHERE manager_id = NULL;
-- Correct
SELECT * FROM employees WHERE manager_id IS NULL;
-- COALESCE returns the first non-NULL value
SELECT name, COALESCE(phone, email, 'No contact') as contact
FROM employees;
Q3: What’s the order of SQL clause execution?
This trips up many candidates. Logical execution order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT.
Q4: Write a query to get the second highest salary.
-- Using LIMIT/OFFSET
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;
-- Using subquery (more portable)
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
Q5-15 cover: SELECT DISTINCT, BETWEEN, IN, LIKE with wildcards, aliasing, basic date functions, UNION vs UNION ALL, and CASE statements.
-- CASE for conditional logic
SELECT name,
CASE
WHEN salary > 100000 THEN 'Senior'
WHEN salary > 50000 THEN 'Mid'
ELSE 'Junior'
END as level
FROM employees;
Joins and Relationships (16-25)
Q16: Explain the difference between JOIN types.
-- INNER JOIN: Only matching rows from both tables
SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.id;
-- LEFT JOIN: All rows from left table, matching from right
SELECT e.name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.id;
-- RIGHT JOIN: All rows from right table, matching from left
-- FULL OUTER JOIN: All rows from both tables
Q17: Find employees without a department (unmatched records).
SELECT e.name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.id
WHERE d.id IS NULL;
Q18: Write a self-join to find employees and their managers.
Self-joins are common interview questions. You’re joining a table to itself.
SELECT
e.name as employee,
m.name as manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
Q19: Find all pairs of employees in the same department.
SELECT
e1.name as employee1,
e2.name as employee2,
e1.dept_id
FROM employees e1
JOIN employees e2 ON e1.dept_id = e2.dept_id AND e1.id < e2.id;
The e1.id < e2.id prevents duplicate pairs (A,B) and (B,A).
Q20-25 cover: CROSS JOIN for cartesian products, joining more than two tables, handling many-to-many relationships through junction tables, and anti-joins using NOT EXISTS.
Aggregations and Grouping (26-32)
Q26: Find duplicate email addresses.
SELECT email, COUNT(*) as count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
Q27: Get total sales by region, showing only regions with sales over $10,000.
SELECT
region,
SUM(amount) as total_sales,
COUNT(*) as num_orders,
AVG(amount) as avg_order
FROM orders
GROUP BY region
HAVING SUM(amount) > 10000
ORDER BY total_sales DESC;
Q28: Count orders by status with conditional aggregation.
SELECT
customer_id,
COUNT(*) as total_orders,
SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) as completed,
SUM(CASE WHEN status = 'pending' THEN 1 ELSE 0 END) as pending,
SUM(CASE WHEN status = 'cancelled' THEN 1 ELSE 0 END) as cancelled
FROM orders
GROUP BY customer_id;
Q29-32 cover: GROUP BY with multiple columns, using aggregate functions with DISTINCT, calculating percentages within groups, and filtering with aggregate conditions.
Subqueries and CTEs (33-40)
Q33: What’s the difference between correlated and non-correlated subqueries?
Non-correlated subqueries execute once. Correlated subqueries execute for each row in the outer query.
-- Non-correlated: Subquery runs once
SELECT * FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
-- Correlated: Subquery runs for each employee
SELECT * FROM employees e1
WHERE salary > (
SELECT AVG(salary)
FROM employees e2
WHERE e2.dept_id = e1.dept_id
);
Q34: Find the nth highest salary using a CTE.
WITH ranked_salaries AS (
SELECT
salary,
DENSE_RANK() OVER (ORDER BY salary DESC) as rank
FROM employees
)
SELECT salary
FROM ranked_salaries
WHERE rank = 3; -- 3rd highest
Q35: Build an org chart with recursive CTE.
WITH RECURSIVE org_chart AS (
-- Base case: top-level employees (no manager)
SELECT id, name, manager_id, 1 as level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: employees with managers
SELECT e.id, e.name, e.manager_id, oc.level + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart ORDER BY level, name;
Q36: EXISTS vs IN—when to use which?
-- EXISTS is typically faster for large subquery results
SELECT * FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.customer_id = c.id
);
-- IN is cleaner for small, known value sets
SELECT * FROM customers
WHERE country IN ('USA', 'Canada', 'Mexico');
Q37-40 cover: Subqueries in SELECT clause, using CTEs for readability, DELETE with subqueries, and UPDATE using joined data.
Window Functions (41-46)
Q41: Explain ROW_NUMBER vs RANK vs DENSE_RANK.
SELECT
name,
department,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) as row_num,
RANK() OVER (ORDER BY salary DESC) as rank,
DENSE_RANK() OVER (ORDER BY salary DESC) as dense_rank
FROM employees;
-- If two people tie for 2nd: ROW_NUMBER gives 2,3; RANK gives 2,2,4; DENSE_RANK gives 2,2,3
Q42: Get the top 3 earners per department.
WITH ranked AS (
SELECT
name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as rn
FROM employees
)
SELECT * FROM ranked WHERE rn <= 3;
Q43: Calculate running total of sales.
SELECT
order_date,
amount,
SUM(amount) OVER (ORDER BY order_date) as running_total
FROM orders;
Q44: Compare each row to the previous row.
SELECT
order_date,
amount,
LAG(amount, 1) OVER (ORDER BY order_date) as prev_amount,
amount - LAG(amount, 1) OVER (ORDER BY order_date) as difference
FROM orders;
Q45-46 cover: LEAD for looking ahead, FIRST_VALUE/LAST_VALUE, and moving averages with frame clauses.
Advanced Topics & Performance (47-50)
Q47: When should you add an index?
-- Add indexes on columns used in WHERE, JOIN, and ORDER BY
CREATE INDEX idx_employees_dept ON employees(dept_id);
-- Composite index for queries filtering on multiple columns
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
Q48: How do you read an EXPLAIN plan?
Look for: sequential scans on large tables (bad), index usage (good), and estimated row counts that seem wrong.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 123;
Q49: What are ACID properties?
Atomicity (all or nothing), Consistency (valid state to valid state), Isolation (concurrent transactions don’t interfere), Durability (committed data survives crashes).
Q50: Explain normalization forms.
1NF: No repeating groups. 2NF: No partial dependencies. 3NF: No transitive dependencies. In practice, most applications use 3NF with strategic denormalization for performance.
Quick Reference Cheat Sheet
| Pattern | Solution |
|---|---|
| Find duplicates | GROUP BY + HAVING COUNT(*) > 1 |
| Nth highest | DENSE_RANK() or LIMIT/OFFSET |
| Running total | SUM() OVER (ORDER BY …) |
| Row comparison | LAG()/LEAD() |
| Hierarchical data | Recursive CTE |
| Unmatched records | LEFT JOIN + WHERE IS NULL |
Common mistakes to avoid: Using = NULL instead of IS NULL, forgetting GROUP BY columns in SELECT, not handling ties in ranking questions, and using SELECT * in production queries.
Practice these patterns until they’re automatic. SQL interviews reward pattern recognition over raw memorization.