SQL - Complete Tutorial for Beginners
SQL (Structured Query Language) is the standard language for interacting with relational databases. Unlike procedural programming languages, SQL is declarative—you describe the result you want, and...
Key Insights
- SQL is a declarative language where you specify what data you want, not how to retrieve it—the database engine optimizes query execution automatically
- Master the four core operations (SELECT, INSERT, UPDATE, DELETE) and JOIN types to handle 90% of real-world database tasks
- Understanding WHERE clauses, aggregation functions, and subqueries unlocks advanced data manipulation and reporting capabilities
Understanding SQL Basics
SQL (Structured Query Language) is the standard language for interacting with relational databases. Unlike procedural programming languages, SQL is declarative—you describe the result you want, and the database figures out how to get it.
Every SQL statement falls into one of several categories: Data Query Language (DQL) for retrieving data, Data Manipulation Language (DML) for modifying data, Data Definition Language (DDL) for creating structures, and Data Control Language (DCL) for permissions.
-- Creating a simple table (DDL)
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
department VARCHAR(50),
salary DECIMAL(10, 2),
hire_date DATE
);
-- Inserting data (DML)
INSERT INTO employees (id, name, department, salary, hire_date)
VALUES
(1, 'Alice Johnson', 'Engineering', 95000.00, '2020-03-15'),
(2, 'Bob Smith', 'Marketing', 72000.00, '2019-07-22'),
(3, 'Carol White', 'Engineering', 88000.00, '2021-01-10');
SELECT Statements and Filtering
The SELECT statement retrieves data from tables. The WHERE clause filters rows based on conditions, and you can combine multiple conditions using AND, OR, and NOT operators.
-- Basic SELECT
SELECT name, department, salary
FROM employees;
-- Filtering with WHERE
SELECT name, salary
FROM employees
WHERE department = 'Engineering';
-- Multiple conditions
SELECT name, salary
FROM employees
WHERE department = 'Engineering'
AND salary > 90000;
-- Pattern matching with LIKE
SELECT name
FROM employees
WHERE name LIKE 'A%'; -- Names starting with 'A'
-- IN operator for multiple values
SELECT name, department
FROM employees
WHERE department IN ('Engineering', 'Marketing', 'Sales');
-- Range filtering
SELECT name, salary
FROM employees
WHERE salary BETWEEN 70000 AND 90000;
Sorting and Limiting Results
Control the order and quantity of returned data using ORDER BY and LIMIT clauses. These are essential for pagination and presenting data in meaningful ways.
-- Sorting results
SELECT name, salary
FROM employees
ORDER BY salary DESC; -- Highest to lowest
-- Multiple sort columns
SELECT name, department, salary
FROM employees
ORDER BY department ASC, salary DESC;
-- Limiting results (MySQL/PostgreSQL syntax)
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;
-- Pagination with OFFSET
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 10 OFFSET 20; -- Skip first 20, return next 10
Aggregate Functions and Grouping
Aggregate functions perform calculations across multiple rows. The GROUP BY clause groups rows sharing common values, enabling per-group aggregations.
-- Basic aggregations
SELECT
COUNT(*) as total_employees,
AVG(salary) as average_salary,
MAX(salary) as highest_salary,
MIN(salary) as lowest_salary,
SUM(salary) as total_payroll
FROM employees;
-- Grouping data
SELECT
department,
COUNT(*) as employee_count,
AVG(salary) as avg_salary
FROM employees
GROUP BY department;
-- Filtering groups with HAVING
SELECT
department,
AVG(salary) as avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 80000;
-- Complex grouping
SELECT
department,
EXTRACT(YEAR FROM hire_date) as hire_year,
COUNT(*) as hires
FROM employees
GROUP BY department, EXTRACT(YEAR FROM hire_date)
ORDER BY department, hire_year;
JOIN Operations
JOINs combine data from multiple tables based on related columns. Understanding JOIN types is critical for working with normalized databases.
-- Setup additional tables
CREATE TABLE departments (
dept_id INT PRIMARY KEY,
dept_name VARCHAR(50),
manager_id INT
);
CREATE TABLE projects (
project_id INT PRIMARY KEY,
project_name VARCHAR(100),
employee_id INT,
budget DECIMAL(12, 2)
);
-- INNER JOIN: Returns only matching rows
SELECT
e.name,
d.dept_name
FROM employees e
INNER JOIN departments d ON e.department = d.dept_name;
-- LEFT JOIN: All rows from left table, matching from right
SELECT
e.name,
p.project_name
FROM employees e
LEFT JOIN projects p ON e.id = p.employee_id;
-- Multiple JOINs
SELECT
e.name as employee_name,
d.dept_name,
p.project_name,
p.budget
FROM employees e
INNER JOIN departments d ON e.department = d.dept_name
LEFT JOIN projects p ON e.id = p.employee_id;
-- Self JOIN: Joining a table to itself
SELECT
e.name as employee,
m.name as manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
Subqueries and Derived Tables
Subqueries are queries nested inside other queries. They enable complex filtering and calculations that would be difficult with JOINs alone.
-- Subquery in WHERE clause
SELECT name, salary
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
);
-- Subquery with IN
SELECT name, department
FROM employees
WHERE department IN (
SELECT dept_name
FROM departments
WHERE budget > 1000000
);
-- Correlated subquery
SELECT
e1.name,
e1.salary,
e1.department
FROM employees e1
WHERE salary > (
SELECT AVG(e2.salary)
FROM employees e2
WHERE e2.department = e1.department
);
-- Derived table (subquery in FROM)
SELECT
dept_stats.department,
dept_stats.avg_salary,
e.name
FROM (
SELECT
department,
AVG(salary) as avg_salary
FROM employees
GROUP BY department
) dept_stats
JOIN employees e ON e.department = dept_stats.department
WHERE e.salary > dept_stats.avg_salary;
Modifying Data
INSERT, UPDATE, and DELETE statements modify table data. Always use WHERE clauses with UPDATE and DELETE to avoid unintended changes.
-- Inserting single row
INSERT INTO employees (id, name, department, salary, hire_date)
VALUES (4, 'David Brown', 'Sales', 68000.00, '2022-05-01');
-- Inserting multiple rows
INSERT INTO employees (id, name, department, salary, hire_date)
VALUES
(5, 'Emma Davis', 'HR', 75000.00, '2021-09-15'),
(6, 'Frank Miller', 'Engineering', 92000.00, '2020-11-03');
-- Updating records
UPDATE employees
SET salary = salary * 1.05
WHERE department = 'Engineering';
-- Updating with conditions
UPDATE employees
SET department = 'Senior Engineering'
WHERE department = 'Engineering'
AND salary > 90000;
-- Deleting records
DELETE FROM employees
WHERE hire_date < '2019-01-01';
-- Safe delete with SELECT verification first
SELECT * FROM employees WHERE department = 'Marketing';
-- Then delete if results are correct
DELETE FROM employees WHERE department = 'Marketing';
Common Table Expressions (CTEs)
CTEs make complex queries more readable by breaking them into named, reusable components. They’re especially useful for recursive queries and multi-step transformations.
-- Basic CTE
WITH high_earners AS (
SELECT name, department, salary
FROM employees
WHERE salary > 85000
)
SELECT
department,
COUNT(*) as high_earner_count,
AVG(salary) as avg_high_salary
FROM high_earners
GROUP BY department;
-- Multiple CTEs
WITH
dept_averages AS (
SELECT
department,
AVG(salary) as avg_salary
FROM employees
GROUP BY department
),
above_average AS (
SELECT
e.name,
e.department,
e.salary,
da.avg_salary
FROM employees e
JOIN dept_averages da ON e.department = da.department
WHERE e.salary > da.avg_salary
)
SELECT * FROM above_average
ORDER BY department, salary DESC;
Window Functions
Window functions perform calculations across sets of rows related to the current row, without collapsing the result set like GROUP BY does.
-- Ranking employees by salary within department
SELECT
name,
department,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dept_rank
FROM employees;
-- Running totals
SELECT
name,
hire_date,
salary,
SUM(salary) OVER (ORDER BY hire_date) as running_total
FROM employees;
-- Moving averages
SELECT
name,
hire_date,
salary,
AVG(salary) OVER (
ORDER BY hire_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) as moving_avg
FROM employees;
SQL mastery comes from practice. Start with simple SELECT statements, gradually incorporate JOINs and aggregations, then tackle subqueries and window functions. Each concept builds on the previous, creating a powerful toolkit for data manipulation.