How to Use ROW_NUMBER in MySQL

ROW_NUMBER() is a window function introduced in MySQL 8.0 that assigns a unique sequential integer to each row within a result set. Unlike traditional aggregate functions that collapse rows, window...

Key Insights

  • ROW_NUMBER() assigns unique sequential integers to rows within a result set, making it essential for pagination, deduplication, and ranking tasks in MySQL 8.0+
  • The PARTITION BY clause resets row numbering for each group, enabling powerful per-category operations like finding top N records within departments or categories
  • While ROW_NUMBER() is versatile, understanding when to use RANK() or DENSE_RANK() instead—and how indexing affects performance—separates effective implementations from slow ones

Introduction to ROW_NUMBER()

ROW_NUMBER() is a window function introduced in MySQL 8.0 that assigns a unique sequential integer to each row within a result set. Unlike traditional aggregate functions that collapse rows, window functions perform calculations across rows while maintaining the individual row structure.

The primary purpose of ROW_NUMBER() is to generate sequential numbering based on a specific ordering criteria. This seemingly simple capability unlocks powerful solutions for common database challenges: implementing efficient pagination, identifying top performers within categories, removing duplicate records, and creating ranked lists.

Before MySQL 8.0, developers had to resort to user-defined variables or complex subqueries to achieve similar results. The introduction of window functions brought MySQL in line with other enterprise databases and dramatically simplified these operations.

Basic Syntax and Simple Examples

The fundamental syntax for ROW_NUMBER() consists of two parts: the function itself and the OVER clause that defines how rows are ordered:

ROW_NUMBER() OVER (ORDER BY column_name [ASC|DESC])

The OVER clause is mandatory and specifies the window over which the function operates. At minimum, you need an ORDER BY within the OVER clause to determine the sequence of row numbering.

Here’s a straightforward example using an employees table:

SELECT 
    employee_id,
    first_name,
    last_name,
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;

This query assigns row numbers starting from 1, ordering employees by salary from highest to lowest. The employee with the highest salary gets row number 1, the second-highest gets 2, and so on. Even if two employees have identical salaries, each receives a unique row number—this deterministic behavior distinguishes ROW_NUMBER() from other ranking functions.

Using PARTITION BY for Grouped Numbering

The real power of ROW_NUMBER() emerges when you combine it with PARTITION BY. This clause divides the result set into groups, and row numbering resets to 1 for each partition.

The syntax expands to:

ROW_NUMBER() OVER (PARTITION BY column_name ORDER BY column_name)

Consider this example that numbers employees within each department:

SELECT 
    department_id,
    employee_id,
    first_name,
    last_name,
    hire_date,
    ROW_NUMBER() OVER (
        PARTITION BY department_id 
        ORDER BY hire_date ASC
    ) AS seniority_rank
FROM employees
ORDER BY department_id, seniority_rank;

This query assigns row numbers to employees based on their hire date, but the numbering resets for each department. The first person hired in Sales gets rank 1, and the first person hired in Engineering also gets rank 1. This partitioning capability is crucial for per-group analysis.

Practical Use Cases

Pagination

ROW_NUMBER() provides a robust alternative to LIMIT/OFFSET for pagination, especially when dealing with dynamic datasets or when you need consistent page boundaries.

WITH numbered_products AS (
    SELECT 
        product_id,
        product_name,
        price,
        category,
        ROW_NUMBER() OVER (ORDER BY product_id) AS row_num
    FROM products
)
SELECT 
    product_id,
    product_name,
    price,
    category
FROM numbered_products
WHERE row_num BETWEEN 21 AND 30;

This query retrieves page 3 of results (assuming 10 items per page). The CTE (Common Table Expression) assigns row numbers to all products, then the outer query filters for rows 21-30. This approach is more stable than OFFSET when underlying data changes between page requests.

Finding Top N Records Per Group

One of the most common use cases is identifying the top N records within each category—highest-paid employees per department, best-selling products per region, or most recent orders per customer.

WITH ranked_employees AS (
    SELECT 
        department_id,
        employee_id,
        first_name,
        last_name,
        salary,
        ROW_NUMBER() OVER (
            PARTITION BY department_id 
            ORDER BY salary DESC
        ) AS salary_rank
    FROM employees
)
SELECT 
    department_id,
    employee_id,
    first_name,
    last_name,
    salary
FROM ranked_employees
WHERE salary_rank <= 3
ORDER BY department_id, salary_rank;

This query returns the three highest-paid employees in each department. The PARTITION BY ensures rankings are calculated independently for each department, and the WHERE clause filters for only the top 3 positions.

Removing Duplicates

ROW_NUMBER() excels at identifying and removing duplicate records while retaining control over which record to keep.

WITH duplicate_customers AS (
    SELECT 
        customer_id,
        email,
        created_at,
        ROW_NUMBER() OVER (
            PARTITION BY email 
            ORDER BY created_at DESC
        ) AS row_num
    FROM customers
)
DELETE FROM customers
WHERE customer_id IN (
    SELECT customer_id 
    FROM duplicate_customers 
    WHERE row_num > 1
);

This query identifies duplicate customers by email address and deletes all but the most recent record (highest created_at timestamp). The PARTITION BY email groups duplicates together, and ORDER BY created_at DESC ensures the most recent record gets row_num = 1.

ROW_NUMBER() vs RANK() vs DENSE_RANK()

MySQL provides three ranking window functions, each handling tied values differently. Understanding these differences is critical for choosing the right function.

SELECT 
    player_name,
    score,
    ROW_NUMBER() OVER (ORDER BY score DESC) AS row_number,
    RANK() OVER (ORDER BY score DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rank
FROM leaderboard
ORDER BY score DESC;

Given these scores:

  • Alice: 100
  • Bob: 95
  • Carol: 95
  • David: 90

The results would be:

player_name score row_number rank dense_rank
Alice 100 1 1 1
Bob 95 2 2 2
Carol 95 3 2 2
David 90 4 4 3

ROW_NUMBER() assigns unique numbers regardless of ties. Bob and Carol both get different row numbers even though they have the same score.

RANK() assigns the same rank to tied values but skips subsequent ranks. Both Bob and Carol get rank 2, so David gets rank 4 (skipping rank 3).

DENSE_RANK() also assigns the same rank to ties but doesn’t skip ranks. Bob and Carol get rank 2, and David gets rank 3.

Use ROW_NUMBER() when you need unique identifiers for every row, RANK() for traditional competition-style rankings, and DENSE_RANK() when you want consecutive rank values without gaps.

Performance Considerations and Best Practices

Window functions can be resource-intensive on large datasets. Here are key performance considerations:

Indexing matters. Create indexes on columns used in ORDER BY and PARTITION BY clauses. For the employees example, an index on (department_id, salary DESC) would significantly improve performance:

CREATE INDEX idx_dept_salary ON employees(department_id, salary DESC);

ROW_NUMBER() vs LIMIT/OFFSET. For simple pagination on static datasets, LIMIT/OFFSET is faster. Use ROW_NUMBER() when you need:

  • Consistent results across multiple queries
  • Complex partitioning logic
  • To combine pagination with other window functions

Avoid unnecessary sorting. If your data is already ordered by the clustering key, align your ORDER BY clause with that ordering to leverage existing sort order.

Consider materialized results. For frequently accessed rankings that don’t change often, consider storing ROW_NUMBER() results in a materialized table and refreshing periodically rather than calculating on every query.

Watch memory usage. Window functions may require temporary tables for sorting and partitioning. Monitor your tmp_table_size and max_heap_table_size settings for large result sets.

Conclusion

ROW_NUMBER() is an indispensable tool in modern MySQL development. It simplifies complex ranking and numbering operations that previously required convoluted subqueries or application-level processing. Master the basic syntax with ORDER BY, leverage PARTITION BY for grouped operations, and understand the distinctions between ROW_NUMBER(), RANK(), and DENSE_RANK() to choose the right tool for each scenario.

Start with simple examples in your development environment, then progress to more complex use cases like pagination and deduplication. Pay attention to performance implications on production datasets, and always index appropriately. The investment in learning window functions pays dividends in cleaner, more maintainable SQL code.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.