How to Use Window Functions in MySQL

Window functions perform calculations across a set of rows that are related to the current row, but unlike aggregate functions with GROUP BY, they don't collapse multiple rows into a single output...

Key Insights

  • Window functions calculate across rows without collapsing results like GROUP BY, letting you combine aggregates with row-level detail in a single query
  • MySQL 8.0+ supports window functions including ranking (ROW_NUMBER, RANK, DENSE_RANK), aggregates (SUM, AVG), and analytical functions (LAG, LEAD, NTILE)
  • Frame specifications (ROWS BETWEEN, RANGE BETWEEN) control which rows are included in calculations, enabling running totals, moving averages, and complex analytical queries

Introduction to Window Functions

Window functions perform calculations across a set of rows that are related to the current row, but unlike aggregate functions with GROUP BY, they don’t collapse multiple rows into a single output row. This means you can compute aggregates while preserving the detail of individual rows.

Here’s the fundamental difference:

-- Traditional aggregate: collapses to one row per department
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department;

-- Window function: preserves all rows, adds aggregate column
SELECT 
    employee_name,
    department,
    salary,
    AVG(salary) OVER (PARTITION BY department) as dept_avg_salary
FROM employees;

The window function version gives you each employee’s row with their department’s average salary alongside their individual salary. This is invaluable for comparisons, rankings, and analytical queries where you need both detail and aggregate context.

Basic Window Function Syntax

The core syntax follows this pattern:

function_name() OVER (
    [PARTITION BY column1, column2, ...]
    [ORDER BY column3, column4, ...]
    [frame_specification]
)

The OVER clause defines the “window” of rows for the calculation:

  • PARTITION BY divides rows into groups (like GROUP BY, but doesn’t collapse results)
  • ORDER BY determines the sequence within each partition
  • Frame specification (optional) defines which rows within the partition to include

Here’s ROW_NUMBER() in action:

SELECT 
    employee_name,
    department,
    hire_date,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY hire_date) as hire_sequence
FROM employees;

This assigns a sequential number to employees within each department based on hire date.

Now compare ranking functions:

SELECT 
    employee_name,
    salary,
    RANK() OVER (ORDER BY salary DESC) as rank_with_gaps,
    DENSE_RANK() OVER (ORDER BY salary DESC) as rank_no_gaps
FROM employees;

If two employees have the same salary, RANK() gives them the same rank but skips the next number (1, 2, 2, 4), while DENSE_RANK() doesn’t skip (1, 2, 2, 3).

Common Ranking Functions

Each ranking function serves specific use cases:

ROW_NUMBER(): Always assigns unique sequential integers, even for ties. Use when you need distinct row identifiers or to select exactly N rows per group.

RANK(): Assigns the same rank to ties but leaves gaps. Use for competitive rankings where ties should “consume” positions (like Olympic medals).

DENSE_RANK(): Assigns the same rank to ties without gaps. Use when you want consecutive rank numbers regardless of ties.

NTILE(n): Divides rows into n roughly equal buckets. Use for quartiles, percentiles, or distributing data evenly.

Here’s a comprehensive example showing salary rankings within departments:

SELECT 
    employee_name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as row_num,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) as rank_val,
    DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dense_rank_val,
    NTILE(4) OVER (PARTITION BY department ORDER BY salary DESC) as quartile
FROM employees;

This single query gives you multiple perspectives on salary distribution. The quartile column is particularly useful for identifying top 25% performers within each department.

Aggregate Window Functions

Standard aggregate functions become window functions when you add an OVER clause. This enables running totals, moving averages, and cumulative statistics.

Running total of sales:

SELECT 
    sale_date,
    amount,
    SUM(amount) OVER (ORDER BY sale_date) as running_total,
    COUNT(*) OVER (ORDER BY sale_date) as transaction_count
FROM sales
ORDER BY sale_date;

Moving average for smoothing data:

SELECT 
    month,
    revenue,
    AVG(revenue) OVER (
        ORDER BY month 
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) as three_month_avg
FROM monthly_revenue
ORDER BY month;

The ROWS BETWEEN 2 PRECEDING AND CURRENT ROW creates a sliding window of three rows (two before plus current), calculating the average across those rows as the window moves through the dataset.

Frame Specifications

Frame specifications give you precise control over which rows are included in window calculations. The syntax is:

{ROWS | RANGE} BETWEEN frame_start AND frame_end

Common frame boundaries:

  • UNBOUNDED PRECEDING: From the first row of the partition
  • N PRECEDING: N rows before the current row
  • CURRENT ROW: The current row
  • N FOLLOWING: N rows after the current row
  • UNBOUNDED FOLLOWING: To the last row of the partition

Here’s the difference between implicit and explicit frames:

-- Implicit frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
SELECT 
    order_date,
    amount,
    SUM(amount) OVER (ORDER BY order_date) as running_total
FROM orders;

-- Explicit frame: same result, but clearer intent
SELECT 
    order_date,
    amount,
    SUM(amount) OVER (
        ORDER BY order_date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as running_total
FROM orders;

ROWS counts physical rows, while RANGE considers logical ranges based on the ORDER BY value. Use ROWS for most cases; use RANGE when you want to include all rows with the same ORDER BY value.

Practical Use Cases

Month-over-month growth with LAG():

SELECT 
    month,
    revenue,
    LAG(revenue, 1) OVER (ORDER BY month) as prev_month_revenue,
    ROUND((revenue - LAG(revenue, 1) OVER (ORDER BY month)) / 
          LAG(revenue, 1) OVER (ORDER BY month) * 100, 2) as growth_pct
FROM monthly_revenue
ORDER BY month;

LAG(column, offset) accesses values from previous rows. LEAD() does the opposite, accessing future rows.

Top N per category:

SELECT *
FROM (
    SELECT 
        product_name,
        category,
        sales,
        ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) as rank_in_category
    FROM products
) ranked
WHERE rank_in_category <= 3;

This pattern is essential for “top N per group” queries, which are cumbersome with traditional SQL.

Percentile ranking:

SELECT 
    employee_name,
    salary,
    PERCENT_RANK() OVER (ORDER BY salary) as percentile,
    ROUND(PERCENT_RANK() OVER (ORDER BY salary) * 100, 1) as percentile_display
FROM employees;

PERCENT_RANK() returns a value between 0 and 1 representing the relative rank. Multiply by 100 for a percentage.

Performance Considerations and Best Practices

MySQL Version: Window functions require MySQL 8.0 or later. Verify your version with SELECT VERSION();

Indexing: Create indexes on columns used in PARTITION BY and ORDER BY clauses:

CREATE INDEX idx_dept_salary ON employees(department, salary DESC);

This index supports queries partitioning by department and ordering by salary.

When to use window functions vs. subqueries: Window functions are cleaner and often faster than correlated subqueries. Replace this:

-- Slow correlated subquery
SELECT 
    e1.employee_name,
    e1.salary,
    (SELECT AVG(e2.salary) 
     FROM employees e2 
     WHERE e2.department = e1.department) as dept_avg
FROM employees e1;

With this:

-- Faster window function
SELECT 
    employee_name,
    salary,
    AVG(salary) OVER (PARTITION BY department) as dept_avg
FROM employees;

Common pitfalls:

  1. Don’t use window functions in WHERE clauses: They execute after WHERE. Use a subquery or CTE:
-- Wrong
SELECT * FROM employees 
WHERE ROW_NUMBER() OVER (ORDER BY salary DESC) <= 10;

-- Correct
SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY salary DESC) as rn
    FROM employees
) ranked
WHERE rn <= 10;
  1. Avoid unnecessary calculations: If you reference the same window multiple times, use WINDOW clause:
SELECT 
    employee_name,
    salary,
    RANK() OVER w as rank_val,
    DENSE_RANK() OVER w as dense_rank_val
FROM employees
WINDOW w AS (ORDER BY salary DESC);
  1. Be explicit with frame specifications: Default frames can be surprising. When in doubt, specify ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW explicitly.

Window functions transform complex analytical queries from multi-step nightmares into elegant single statements. Master them, and you’ll write cleaner, faster SQL for ranking, running totals, time-series analysis, and comparative metrics. The investment in learning their syntax pays dividends every time you need to answer “show me X compared to Y” questions.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.