How to Use Window Functions in SQLite

Window functions transform how you write analytical queries in SQLite. Unlike aggregate functions that collapse multiple rows into a single result, window functions calculate values across a set of...

Key Insights

  • Window functions perform calculations across rows without collapsing results like GROUP BY does, making them essential for rankings, running totals, and row-to-row comparisons in a single query
  • SQLite has supported window functions since version 3.25.0 (2018), providing all standard SQL window functions including ROW_NUMBER(), LAG(), LEAD(), and aggregate functions with OVER() clauses
  • Proper use of PARTITION BY and frame specifications (ROWS BETWEEN) allows you to control exactly which rows participate in each calculation, enabling complex analytics that would otherwise require self-joins or subqueries

Introduction to Window Functions

Window functions transform how you write analytical queries in SQLite. Unlike aggregate functions that collapse multiple rows into a single result, window functions calculate values across a set of rows while preserving each individual row in your result set.

Consider calculating a running total of sales. With traditional GROUP BY, you’d need multiple queries or a self-join. Window functions solve this elegantly:

-- Traditional approach: requires self-join or subquery
SELECT 
    date,
    amount,
    (SELECT SUM(amount) 
     FROM sales s2 
     WHERE s2.date <= s1.date) AS running_total
FROM sales s1;

-- Window function approach: clean and efficient
SELECT 
    date,
    amount,
    SUM(amount) OVER (ORDER BY date) AS running_total
FROM sales;

The window function version is not only more readable but also typically more efficient. SQLite processes the window function in a single pass rather than executing a correlated subquery for each row.

Basic Window Function Syntax

Every window function uses the OVER() clause to define its “window” - the set of rows used for calculation. The basic syntax includes three optional components:

function_name() OVER (
    [PARTITION BY partition_expression]
    [ORDER BY sort_expression]
    [frame_specification]
)

Here’s how each component works:

-- Basic window: operates on entire result set
SELECT 
    product_name,
    price,
    AVG(price) OVER () AS avg_price
FROM products;

-- PARTITION BY: separate windows per category
SELECT 
    category,
    product_name,
    price,
    AVG(price) OVER (PARTITION BY category) AS category_avg_price
FROM products;

-- ORDER BY: creates ordered frame for calculations
SELECT 
    order_date,
    amount,
    SUM(amount) OVER (ORDER BY order_date) AS cumulative_total
FROM orders;

Frame specifications control which rows within the window participate in the calculation:

-- Default frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
SELECT 
    date,
    sales,
    SUM(sales) OVER (
        ORDER BY date 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM daily_sales;

-- Moving average: last 7 days including current
SELECT 
    date,
    sales,
    AVG(sales) OVER (
        ORDER BY date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS moving_avg_7day
FROM daily_sales;

Ranking Functions

Ranking functions assign positions to rows based on ordering criteria. SQLite provides four ranking functions, each with different behavior for ties:

CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT,
    department TEXT,
    salary INTEGER
);

-- ROW_NUMBER: unique sequential number (arbitrary order for ties)
-- RANK: same rank for ties, gaps in sequence
-- DENSE_RANK: same rank for ties, no gaps
-- NTILE: divides rows into N buckets

SELECT 
    name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank,
    DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank,
    NTILE(4) OVER (PARTITION BY department ORDER BY salary DESC) AS quartile
FROM employees;

A practical use case is finding the top N items per category:

-- Top 3 highest-paid employees per department
WITH ranked_employees AS (
    SELECT 
        name,
        department,
        salary,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
)
SELECT name, department, salary
FROM ranked_employees
WHERE rn <= 3;

NTILE() is particularly useful for percentile analysis:

-- Categorize products into price quartiles
SELECT 
    product_name,
    price,
    NTILE(4) OVER (ORDER BY price) AS price_quartile,
    CASE NTILE(4) OVER (ORDER BY price)
        WHEN 1 THEN 'Budget'
        WHEN 2 THEN 'Economy'
        WHEN 3 THEN 'Premium'
        WHEN 4 THEN 'Luxury'
    END AS price_category
FROM products;

Aggregate Window Functions

Standard aggregate functions (SUM, AVG, COUNT, MIN, MAX) become window functions when you add an OVER() clause. This enables running calculations and moving windows:

-- Running total with cumulative count
SELECT 
    order_date,
    order_id,
    amount,
    SUM(amount) OVER (ORDER BY order_date, order_id) AS running_total,
    COUNT(*) OVER (ORDER BY order_date, order_id) AS order_count,
    AVG(amount) OVER (ORDER BY order_date, order_id) AS running_avg
FROM orders;

Moving windows are essential for time-series analysis:

-- 7-day moving average and 30-day moving sum
SELECT 
    date,
    daily_revenue,
    AVG(daily_revenue) OVER (
        ORDER BY date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS moving_avg_7day,
    SUM(daily_revenue) OVER (
        ORDER BY date 
        ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
    ) AS moving_sum_30day
FROM daily_revenue
ORDER BY date;

You can combine PARTITION BY with frame specifications for grouped moving calculations:

-- Moving average per product category
SELECT 
    date,
    category,
    sales,
    AVG(sales) OVER (
        PARTITION BY category 
        ORDER BY date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS category_moving_avg
FROM product_sales;

Value Functions

Value functions access data from other rows relative to the current row. LAG() and LEAD() are the most commonly used:

-- Month-over-month comparison
SELECT 
    month,
    revenue,
    LAG(revenue) OVER (ORDER BY month) AS previous_month,
    revenue - LAG(revenue) OVER (ORDER BY month) AS mom_change,
    ROUND(100.0 * (revenue - LAG(revenue) OVER (ORDER BY month)) / 
          LAG(revenue) OVER (ORDER BY month), 2) AS mom_change_pct
FROM monthly_revenue;

LEAD() looks forward instead of backward:

-- Compare current price with next price change
SELECT 
    product_id,
    effective_date,
    price,
    LEAD(price) OVER (PARTITION BY product_id ORDER BY effective_date) AS next_price,
    LEAD(effective_date) OVER (PARTITION BY product_id ORDER BY effective_date) AS next_change_date
FROM price_history;

FIRST_VALUE() and LAST_VALUE() extract boundary values from windows:

-- Compare each sale to first and last sale of the day
SELECT 
    sale_time,
    amount,
    FIRST_VALUE(amount) OVER (
        PARTITION BY DATE(sale_time) 
        ORDER BY sale_time
    ) AS first_sale_of_day,
    LAST_VALUE(amount) OVER (
        PARTITION BY DATE(sale_time) 
        ORDER BY sale_time
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_sale_of_day
FROM sales;

Note the frame specification for LAST_VALUE() - without it, the default frame only extends to the current row, not the actual last row.

Practical Use Cases

Window functions shine in real-world analytics. Here’s a comprehensive sales dashboard query:

WITH sales_analytics AS (
    SELECT 
        product_id,
        product_name,
        category,
        sale_date,
        quantity,
        revenue,
        -- Rankings
        RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS category_rank,
        -- Running totals
        SUM(revenue) OVER (PARTITION BY product_id ORDER BY sale_date) AS cumulative_revenue,
        -- Period comparisons
        LAG(revenue) OVER (PARTITION BY product_id ORDER BY sale_date) AS prev_period_revenue,
        -- Moving averages
        AVG(revenue) OVER (
            PARTITION BY product_id 
            ORDER BY sale_date 
            ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
        ) AS moving_avg_7periods,
        -- Percentage of category total
        100.0 * revenue / SUM(revenue) OVER (PARTITION BY category) AS pct_of_category
    FROM sales
)
SELECT * FROM sales_analytics
WHERE category_rank <= 10
ORDER BY category, category_rank;

Deduplication is another practical application:

-- Keep only the most recent record per customer
DELETE FROM customer_data
WHERE rowid NOT IN (
    SELECT rowid
    FROM (
        SELECT 
            rowid,
            ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY updated_at DESC) AS rn
        FROM customer_data
    )
    WHERE rn = 1
);

Performance Considerations

Window functions are efficient, but understanding their performance characteristics helps you write better queries:

  1. Indexing matters: Create indexes on columns used in PARTITION BY and ORDER BY clauses:
-- This query benefits from an index
CREATE INDEX idx_sales_category_date ON sales(category, sale_date);

SELECT 
    category,
    sale_date,
    revenue,
    SUM(revenue) OVER (PARTITION BY category ORDER BY sale_date) AS running_total
FROM sales;
  1. Avoid redundant window definitions: Use the WINDOW clause to define windows once:
-- Inefficient: repeating window definition
SELECT 
    product_id,
    SUM(revenue) OVER (PARTITION BY category ORDER BY sale_date),
    AVG(revenue) OVER (PARTITION BY category ORDER BY sale_date)
FROM sales;

-- Better: define window once
SELECT 
    product_id,
    SUM(revenue) OVER w,
    AVG(revenue) OVER w
FROM sales
WINDOW w AS (PARTITION BY category ORDER BY sale_date);
  1. Consider materialization for complex windows: If you’re repeatedly querying the same window calculations, materialize them:
CREATE TABLE sales_with_analytics AS
SELECT 
    *,
    SUM(revenue) OVER (PARTITION BY product_id ORDER BY sale_date) AS cumulative_revenue,
    RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS category_rank
FROM sales;

CREATE INDEX idx_analytics ON sales_with_analytics(category, category_rank);

Window functions are a powerful tool in SQLite’s analytical arsenal. They eliminate complex self-joins, enable elegant time-series analysis, and make ranking queries straightforward. Master the syntax, understand frame specifications, and you’ll write cleaner, more efficient SQL.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.