How to Use Window Functions in SQLite

Key Insights

Window functions perform calculations across rows without collapsing results like GROUP BY does, making them essential for rankings, running totals, and row-to-row comparisons in a single query
SQLite has supported window functions since version 3.25.0 (2018), providing all standard SQL window functions including ROW_NUMBER(), LAG(), LEAD(), and aggregate functions with OVER() clauses
Proper use of PARTITION BY and frame specifications (ROWS BETWEEN) allows you to control exactly which rows participate in each calculation, enabling complex analytics that would otherwise require self-joins or subqueries

Introduction to Window Functions

Window functions transform how you write analytical queries in SQLite. Unlike aggregate functions that collapse multiple rows into a single result, window functions calculate values across a set of rows while preserving each individual row in your result set.

Consider calculating a running total of sales. With traditional GROUP BY, you’d need multiple queries or a self-join. Window functions solve this elegantly:

-- Traditional approach: requires self-join or subquery
SELECT 
    date,
    amount,
    (SELECT SUM(amount) 
     FROM sales s2 
     WHERE s2.date <= s1.date) AS running_total
FROM sales s1;

-- Window function approach: clean and efficient
SELECT 
    date,
    amount,
    SUM(amount) OVER (ORDER BY date) AS running_total
FROM sales;

The window function version is not only more readable but also typically more efficient. SQLite processes the window function in a single pass rather than executing a correlated subquery for each row.

Basic Window Function Syntax

Every window function uses the OVER() clause to define its “window” - the set of rows used for calculation. The basic syntax includes three optional components:

function_name() OVER (
    [PARTITION BY partition_expression]
    [ORDER BY sort_expression]
    [frame_specification]
)

Here’s how each component works:

-- Basic window: operates on entire result set
SELECT 
    product_name,
    price,
    AVG(price) OVER () AS avg_price
FROM products;

-- PARTITION BY: separate windows per category
SELECT 
    category,
    product_name,
    price,
    AVG(price) OVER (PARTITION BY category) AS category_avg_price
FROM products;

-- ORDER BY: creates ordered frame for calculations
SELECT 
    order_date,
    amount,
    SUM(amount) OVER (ORDER BY order_date) AS cumulative_total
FROM orders;

Frame specifications control which rows within the window participate in the calculation:

-- Default frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
SELECT 
    date,
    sales,
    SUM(sales) OVER (
        ORDER BY date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM daily_sales;

-- Moving average: last 7 days including current
SELECT 
    date,
    sales,
    AVG(sales) OVER (
        ORDER BY date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS moving_avg_7day
FROM daily_sales;

Ranking Functions

Ranking functions assign positions to rows based on ordering criteria. SQLite provides four ranking functions, each with different behavior for ties:

CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT,
    department TEXT,
    salary INTEGER
);

-- ROW_NUMBER: unique sequential number (arbitrary order for ties)
-- RANK: same rank for ties, gaps in sequence
-- DENSE_RANK: same rank for ties, no gaps
-- NTILE: divides rows into N buckets

SELECT 
    name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank,
    DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank,
    NTILE(4) OVER (PARTITION BY department ORDER BY salary DESC) AS quartile
FROM employees;

A practical use case is finding the top N items per category:

-- Top 3 highest-paid employees per department
WITH ranked_employees AS (
    SELECT 
        name,
        department,
        salary,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
)
SELECT name, department, salary
FROM ranked_employees
WHERE rn <= 3;

NTILE() is particularly useful for percentile analysis:

-- Categorize products into price quartiles
SELECT 
    product_name,
    price,
    NTILE(4) OVER (ORDER BY price) AS price_quartile,
    CASE NTILE(4) OVER (ORDER BY price)
        WHEN 1 THEN 'Budget'
        WHEN 2 THEN 'Economy'
        WHEN 3 THEN 'Premium'
        WHEN 4 THEN 'Luxury'
    END AS price_category
FROM products;

Aggregate Window Functions

Standard aggregate functions (SUM, AVG, COUNT, MIN, MAX) become window functions when you add an OVER() clause. This enables running calculations and moving windows:

-- Running total with cumulative count
SELECT 
    order_date,
    order_id,
    amount,
    SUM(amount) OVER (ORDER BY order_date, order_id) AS running_total,
    COUNT(*) OVER (ORDER BY order_date, order_id) AS order_count,
    AVG(amount) OVER (ORDER BY order_date, order_id) AS running_avg
FROM orders;

Moving windows are essential for time-series analysis:

-- 7-day moving average and 30-day moving sum
SELECT 
    date,
    daily_revenue,
    AVG(daily_revenue) OVER (
        ORDER BY date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS moving_avg_7day,
    SUM(daily_revenue) OVER (
        ORDER BY date 
        ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
    ) AS moving_sum_30day
FROM daily_revenue
ORDER BY date;

You can combine PARTITION BY with frame specifications for grouped moving calculations:

-- Moving average per product category
SELECT 
    date,
    category,
    sales,
    AVG(sales) OVER (
        PARTITION BY category 
        ORDER BY date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS category_moving_avg
FROM product_sales;

Value Functions

Value functions access data from other rows relative to the current row. LAG() and LEAD() are the most commonly used:

-- Month-over-month comparison
SELECT 
    month,
    revenue,
    LAG(revenue) OVER (ORDER BY month) AS previous_month,
    revenue - LAG(revenue) OVER (ORDER BY month) AS mom_change,
    ROUND(100.0 * (revenue - LAG(revenue) OVER (ORDER BY month)) / 
          LAG(revenue) OVER (ORDER BY month), 2) AS mom_change_pct
FROM monthly_revenue;

LEAD() looks forward instead of backward:

-- Compare current price with next price change
SELECT 
    product_id,
    effective_date,
    price,
    LEAD(price) OVER (PARTITION BY product_id ORDER BY effective_date) AS next_price,
    LEAD(effective_date) OVER (PARTITION BY product_id ORDER BY effective_date) AS next_change_date
FROM price_history;

FIRST_VALUE() and LAST_VALUE() extract boundary values from windows:

-- Compare each sale to first and last sale of the day
SELECT 
    sale_time,
    amount,
    FIRST_VALUE(amount) OVER (
        PARTITION BY DATE(sale_time) 
        ORDER BY sale_time
    ) AS first_sale_of_day,
    LAST_VALUE(amount) OVER (
        PARTITION BY DATE(sale_time) 
        ORDER BY sale_time
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_sale_of_day
FROM sales;

Note the frame specification for LAST_VALUE() - without it, the default frame only extends to the current row, not the actual last row.

Practical Use Cases

Window functions shine in real-world analytics. Here’s a comprehensive sales dashboard query:

WITH sales_analytics AS (
    SELECT 
        product_id,
        product_name,
        category,
        sale_date,
        quantity,
        revenue,
        -- Rankings
        RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS category_rank,
        -- Running totals
        SUM(revenue) OVER (PARTITION BY product_id ORDER BY sale_date) AS cumulative_revenue,
        -- Period comparisons
        LAG(revenue) OVER (PARTITION BY product_id ORDER BY sale_date) AS prev_period_revenue,
        -- Moving averages
        AVG(revenue) OVER (
            PARTITION BY product_id 
            ORDER BY sale_date 
            ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
        ) AS moving_avg_7periods,
        -- Percentage of category total
        100.0 * revenue / SUM(revenue) OVER (PARTITION BY category) AS pct_of_category
    FROM sales
)
SELECT * FROM sales_analytics
WHERE category_rank <= 10
ORDER BY category, category_rank;

Deduplication is another practical application:

-- Keep only the most recent record per customer
DELETE FROM customer_data
WHERE rowid NOT IN (
    SELECT rowid
    FROM (
        SELECT 
            rowid,
            ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY updated_at DESC) AS rn
        FROM customer_data
    )
    WHERE rn = 1
);

Performance Considerations

Window functions are efficient, but understanding their performance characteristics helps you write better queries:

Indexing matters: Create indexes on columns used in PARTITION BY and ORDER BY clauses:

-- This query benefits from an index
CREATE INDEX idx_sales_category_date ON sales(category, sale_date);

SELECT 
    category,
    sale_date,
    revenue,
    SUM(revenue) OVER (PARTITION BY category ORDER BY sale_date) AS running_total
FROM sales;

Avoid redundant window definitions: Use the WINDOW clause to define windows once:

-- Inefficient: repeating window definition
SELECT 
    product_id,
    SUM(revenue) OVER (PARTITION BY category ORDER BY sale_date),
    AVG(revenue) OVER (PARTITION BY category ORDER BY sale_date)
FROM sales;

-- Better: define window once
SELECT 
    product_id,
    SUM(revenue) OVER w,
    AVG(revenue) OVER w
FROM sales
WINDOW w AS (PARTITION BY category ORDER BY sale_date);

Consider materialization for complex windows: If you’re repeatedly querying the same window calculations, materialize them:

CREATE TABLE sales_with_analytics AS
SELECT 
    *,
    SUM(revenue) OVER (PARTITION BY product_id ORDER BY sale_date) AS cumulative_revenue,
    RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS category_rank
FROM sales;

CREATE INDEX idx_analytics ON sales_with_analytics(category, category_rank);

Window functions are a powerful tool in SQLite’s analytical arsenal. They eliminate complex self-joins, enable elegant time-series analysis, and make ranking queries straightforward. Master the syntax, understand frame specifications, and you’ll write cleaner, more efficient SQL.