SQL - Complete Tutorial for Beginners

SQL (Structured Query Language) is the standard language for interacting with relational databases. Unlike procedural programming languages, SQL is declarative—you describe the result you want, and...

Key Insights

  • SQL is a declarative language where you specify what data you want, not how to retrieve it—the database engine optimizes query execution automatically
  • Master the four core operations (SELECT, INSERT, UPDATE, DELETE) and JOIN types to handle 90% of real-world database tasks
  • Understanding WHERE clauses, aggregation functions, and subqueries unlocks advanced data manipulation and reporting capabilities

Understanding SQL Basics

SQL (Structured Query Language) is the standard language for interacting with relational databases. Unlike procedural programming languages, SQL is declarative—you describe the result you want, and the database figures out how to get it.

Every SQL statement falls into one of several categories: Data Query Language (DQL) for retrieving data, Data Manipulation Language (DML) for modifying data, Data Definition Language (DDL) for creating structures, and Data Control Language (DCL) for permissions.

-- Creating a simple table (DDL)
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    department VARCHAR(50),
    salary DECIMAL(10, 2),
    hire_date DATE
);

-- Inserting data (DML)
INSERT INTO employees (id, name, department, salary, hire_date)
VALUES 
    (1, 'Alice Johnson', 'Engineering', 95000.00, '2020-03-15'),
    (2, 'Bob Smith', 'Marketing', 72000.00, '2019-07-22'),
    (3, 'Carol White', 'Engineering', 88000.00, '2021-01-10');

SELECT Statements and Filtering

The SELECT statement retrieves data from tables. The WHERE clause filters rows based on conditions, and you can combine multiple conditions using AND, OR, and NOT operators.

-- Basic SELECT
SELECT name, department, salary
FROM employees;

-- Filtering with WHERE
SELECT name, salary
FROM employees
WHERE department = 'Engineering';

-- Multiple conditions
SELECT name, salary
FROM employees
WHERE department = 'Engineering' 
    AND salary > 90000;

-- Pattern matching with LIKE
SELECT name
FROM employees
WHERE name LIKE 'A%';  -- Names starting with 'A'

-- IN operator for multiple values
SELECT name, department
FROM employees
WHERE department IN ('Engineering', 'Marketing', 'Sales');

-- Range filtering
SELECT name, salary
FROM employees
WHERE salary BETWEEN 70000 AND 90000;

Sorting and Limiting Results

Control the order and quantity of returned data using ORDER BY and LIMIT clauses. These are essential for pagination and presenting data in meaningful ways.

-- Sorting results
SELECT name, salary
FROM employees
ORDER BY salary DESC;  -- Highest to lowest

-- Multiple sort columns
SELECT name, department, salary
FROM employees
ORDER BY department ASC, salary DESC;

-- Limiting results (MySQL/PostgreSQL syntax)
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;

-- Pagination with OFFSET
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 10 OFFSET 20;  -- Skip first 20, return next 10

Aggregate Functions and Grouping

Aggregate functions perform calculations across multiple rows. The GROUP BY clause groups rows sharing common values, enabling per-group aggregations.

-- Basic aggregations
SELECT 
    COUNT(*) as total_employees,
    AVG(salary) as average_salary,
    MAX(salary) as highest_salary,
    MIN(salary) as lowest_salary,
    SUM(salary) as total_payroll
FROM employees;

-- Grouping data
SELECT 
    department,
    COUNT(*) as employee_count,
    AVG(salary) as avg_salary
FROM employees
GROUP BY department;

-- Filtering groups with HAVING
SELECT 
    department,
    AVG(salary) as avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 80000;

-- Complex grouping
SELECT 
    department,
    EXTRACT(YEAR FROM hire_date) as hire_year,
    COUNT(*) as hires
FROM employees
GROUP BY department, EXTRACT(YEAR FROM hire_date)
ORDER BY department, hire_year;

JOIN Operations

JOINs combine data from multiple tables based on related columns. Understanding JOIN types is critical for working with normalized databases.

-- Setup additional tables
CREATE TABLE departments (
    dept_id INT PRIMARY KEY,
    dept_name VARCHAR(50),
    manager_id INT
);

CREATE TABLE projects (
    project_id INT PRIMARY KEY,
    project_name VARCHAR(100),
    employee_id INT,
    budget DECIMAL(12, 2)
);

-- INNER JOIN: Returns only matching rows
SELECT 
    e.name,
    d.dept_name
FROM employees e
INNER JOIN departments d ON e.department = d.dept_name;

-- LEFT JOIN: All rows from left table, matching from right
SELECT 
    e.name,
    p.project_name
FROM employees e
LEFT JOIN projects p ON e.id = p.employee_id;

-- Multiple JOINs
SELECT 
    e.name as employee_name,
    d.dept_name,
    p.project_name,
    p.budget
FROM employees e
INNER JOIN departments d ON e.department = d.dept_name
LEFT JOIN projects p ON e.id = p.employee_id;

-- Self JOIN: Joining a table to itself
SELECT 
    e.name as employee,
    m.name as manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;

Subqueries and Derived Tables

Subqueries are queries nested inside other queries. They enable complex filtering and calculations that would be difficult with JOINs alone.

-- Subquery in WHERE clause
SELECT name, salary
FROM employees
WHERE salary > (
    SELECT AVG(salary)
    FROM employees
);

-- Subquery with IN
SELECT name, department
FROM employees
WHERE department IN (
    SELECT dept_name
    FROM departments
    WHERE budget > 1000000
);

-- Correlated subquery
SELECT 
    e1.name,
    e1.salary,
    e1.department
FROM employees e1
WHERE salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department = e1.department
);

-- Derived table (subquery in FROM)
SELECT 
    dept_stats.department,
    dept_stats.avg_salary,
    e.name
FROM (
    SELECT 
        department,
        AVG(salary) as avg_salary
    FROM employees
    GROUP BY department
) dept_stats
JOIN employees e ON e.department = dept_stats.department
WHERE e.salary > dept_stats.avg_salary;

Modifying Data

INSERT, UPDATE, and DELETE statements modify table data. Always use WHERE clauses with UPDATE and DELETE to avoid unintended changes.

-- Inserting single row
INSERT INTO employees (id, name, department, salary, hire_date)
VALUES (4, 'David Brown', 'Sales', 68000.00, '2022-05-01');

-- Inserting multiple rows
INSERT INTO employees (id, name, department, salary, hire_date)
VALUES 
    (5, 'Emma Davis', 'HR', 75000.00, '2021-09-15'),
    (6, 'Frank Miller', 'Engineering', 92000.00, '2020-11-03');

-- Updating records
UPDATE employees
SET salary = salary * 1.05
WHERE department = 'Engineering';

-- Updating with conditions
UPDATE employees
SET department = 'Senior Engineering'
WHERE department = 'Engineering' 
    AND salary > 90000;

-- Deleting records
DELETE FROM employees
WHERE hire_date < '2019-01-01';

-- Safe delete with SELECT verification first
SELECT * FROM employees WHERE department = 'Marketing';
-- Then delete if results are correct
DELETE FROM employees WHERE department = 'Marketing';

Common Table Expressions (CTEs)

CTEs make complex queries more readable by breaking them into named, reusable components. They’re especially useful for recursive queries and multi-step transformations.

-- Basic CTE
WITH high_earners AS (
    SELECT name, department, salary
    FROM employees
    WHERE salary > 85000
)
SELECT 
    department,
    COUNT(*) as high_earner_count,
    AVG(salary) as avg_high_salary
FROM high_earners
GROUP BY department;

-- Multiple CTEs
WITH 
dept_averages AS (
    SELECT 
        department,
        AVG(salary) as avg_salary
    FROM employees
    GROUP BY department
),
above_average AS (
    SELECT 
        e.name,
        e.department,
        e.salary,
        da.avg_salary
    FROM employees e
    JOIN dept_averages da ON e.department = da.department
    WHERE e.salary > da.avg_salary
)
SELECT * FROM above_average
ORDER BY department, salary DESC;

Window Functions

Window functions perform calculations across sets of rows related to the current row, without collapsing the result set like GROUP BY does.

-- Ranking employees by salary within department
SELECT 
    name,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dept_rank
FROM employees;

-- Running totals
SELECT 
    name,
    hire_date,
    salary,
    SUM(salary) OVER (ORDER BY hire_date) as running_total
FROM employees;

-- Moving averages
SELECT 
    name,
    hire_date,
    salary,
    AVG(salary) OVER (
        ORDER BY hire_date 
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) as moving_avg
FROM employees;

SQL mastery comes from practice. Start with simple SELECT statements, gradually incorporate JOINs and aggregations, then tackle subqueries and window functions. Each concept builds on the previous, creating a powerful toolkit for data manipulation.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.