SQL - FIRST_VALUE() and LAST_VALUE()

The basic syntax:

Key Insights

  • FIRST_VALUE() and LAST_VALUE() are window functions that retrieve the first or last value in an ordered partition, essential for comparative analysis and trend detection in SQL queries
  • These functions require explicit frame clauses to avoid unexpected results—LAST_VALUE() defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, not the entire partition
  • Practical applications include tracking price changes, identifying initial states, calculating deltas from baseline values, and detecting data anomalies across time-series data

Understanding Window Function Basics

FIRST_VALUE() and LAST_VALUE() belong to SQL’s window function family. Unlike aggregate functions that collapse rows, window functions perform calculations across row sets while preserving individual rows. These functions access values from other rows in the partition without requiring self-joins.

The basic syntax:

FIRST_VALUE(expression) OVER (
    [PARTITION BY partition_expression]
    ORDER BY sort_expression
    [frame_clause]
)

LAST_VALUE(expression) OVER (
    [PARTITION BY partition_expression]
    ORDER BY sort_expression
    [frame_clause]
)

The Frame Clause Trap

The most common mistake with LAST_VALUE() is omitting the frame clause. By default, window functions use RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, meaning LAST_VALUE() returns the current row’s value—rarely what you want.

CREATE TABLE stock_prices (
    symbol VARCHAR(10),
    price_date DATE,
    closing_price DECIMAL(10,2)
);

INSERT INTO stock_prices VALUES
('AAPL', '2024-01-01', 150.00),
('AAPL', '2024-01-02', 152.50),
('AAPL', '2024-01-03', 148.75),
('AAPL', '2024-01-04', 155.00);

-- Wrong: Returns current row
SELECT 
    symbol,
    price_date,
    closing_price,
    LAST_VALUE(closing_price) OVER (
        PARTITION BY symbol 
        ORDER BY price_date
    ) as last_price_wrong
FROM stock_prices;

-- Correct: Explicit frame specification
SELECT 
    symbol,
    price_date,
    closing_price,
    LAST_VALUE(closing_price) OVER (
        PARTITION BY symbol 
        ORDER BY price_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) as last_price_correct
FROM stock_prices;

Tracking Price Changes

Calculate how each price compares to the first and last prices in a period:

SELECT 
    symbol,
    price_date,
    closing_price,
    FIRST_VALUE(closing_price) OVER w as opening_price,
    LAST_VALUE(closing_price) OVER w as final_price,
    closing_price - FIRST_VALUE(closing_price) OVER w as change_from_open,
    LAST_VALUE(closing_price) OVER w - closing_price as distance_to_close,
    ROUND(
        ((closing_price - FIRST_VALUE(closing_price) OVER w) / 
         FIRST_VALUE(closing_price) OVER w * 100), 2
    ) as pct_change_from_open
FROM stock_prices
WINDOW w AS (
    PARTITION BY symbol 
    ORDER BY price_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY symbol, price_date;

The WINDOW clause defines the window specification once, improving readability and maintenance.

Customer Journey Analysis

Track first and last interactions for customer behavior analysis:

CREATE TABLE customer_interactions (
    customer_id INT,
    interaction_date TIMESTAMP,
    interaction_type VARCHAR(50),
    revenue DECIMAL(10,2)
);

INSERT INTO customer_interactions VALUES
(101, '2024-01-05 10:00:00', 'website_visit', 0),
(101, '2024-01-06 14:30:00', 'email_open', 0),
(101, '2024-01-08 09:15:00', 'purchase', 299.99),
(101, '2024-01-15 16:45:00', 'support_call', 0),
(102, '2024-01-07 11:20:00', 'website_visit', 0),
(102, '2024-01-10 13:00:00', 'purchase', 149.50);

SELECT 
    customer_id,
    interaction_date,
    interaction_type,
    revenue,
    FIRST_VALUE(interaction_type) OVER w as first_touchpoint,
    FIRST_VALUE(interaction_date) OVER w as first_contact,
    LAST_VALUE(interaction_type) OVER w as last_touchpoint,
    LAST_VALUE(interaction_date) OVER w as last_contact,
    interaction_date - FIRST_VALUE(interaction_date) OVER w as days_since_first
FROM customer_interactions
WINDOW w AS (
    PARTITION BY customer_id 
    ORDER BY interaction_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY customer_id, interaction_date;

Identifying Data Quality Issues

Detect records where values deviate significantly from baseline measurements:

CREATE TABLE sensor_readings (
    sensor_id INT,
    reading_time TIMESTAMP,
    temperature DECIMAL(5,2),
    humidity DECIMAL(5,2)
);

INSERT INTO sensor_readings VALUES
(1, '2024-01-01 08:00:00', 22.5, 45.0),
(1, '2024-01-01 09:00:00', 23.1, 46.5),
(1, '2024-01-01 10:00:00', 35.8, 47.2),  -- Anomaly
(1, '2024-01-01 11:00:00', 23.8, 48.0);

SELECT 
    sensor_id,
    reading_time,
    temperature,
    FIRST_VALUE(temperature) OVER w as baseline_temp,
    ABS(temperature - FIRST_VALUE(temperature) OVER w) as temp_deviation,
    CASE 
        WHEN ABS(temperature - FIRST_VALUE(temperature) OVER w) > 10 
        THEN 'ALERT'
        ELSE 'NORMAL'
    END as status
FROM sensor_readings
WINDOW w AS (
    PARTITION BY sensor_id 
    ORDER BY reading_time
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
ORDER BY sensor_id, reading_time;

Monthly Cohort Analysis

Compare performance metrics against the first month’s baseline:

CREATE TABLE monthly_revenue (
    cohort_month DATE,
    customer_segment VARCHAR(50),
    revenue DECIMAL(12,2)
);

INSERT INTO monthly_revenue VALUES
('2024-01-01', 'Enterprise', 500000),
('2024-02-01', 'Enterprise', 525000),
('2024-03-01', 'Enterprise', 510000),
('2024-01-01', 'SMB', 150000),
('2024-02-01', 'SMB', 165000),
('2024-03-01', 'SMB', 180000);

SELECT 
    cohort_month,
    customer_segment,
    revenue,
    FIRST_VALUE(revenue) OVER w as baseline_revenue,
    revenue - FIRST_VALUE(revenue) OVER w as revenue_growth,
    ROUND(
        ((revenue - FIRST_VALUE(revenue) OVER w) / 
         FIRST_VALUE(revenue) OVER w * 100), 2
    ) as growth_percentage,
    LAST_VALUE(revenue) OVER w as projected_final
FROM monthly_revenue
WINDOW w AS (
    PARTITION BY customer_segment 
    ORDER BY cohort_month
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY customer_segment, cohort_month;

Performance Considerations

Window functions execute after the WHERE clause but before ORDER BY. For large datasets, consider these optimizations:

-- Use indexed columns in PARTITION BY and ORDER BY
CREATE INDEX idx_stock_symbol_date ON stock_prices(symbol, price_date);

-- Filter before windowing when possible
WITH filtered_data AS (
    SELECT * 
    FROM stock_prices 
    WHERE price_date >= '2024-01-01'
)
SELECT 
    symbol,
    price_date,
    closing_price,
    FIRST_VALUE(closing_price) OVER (
        PARTITION BY symbol 
        ORDER BY price_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) as first_price
FROM filtered_data;

Handling NULL Values

Both functions respect NULL values but handle them based on position:

CREATE TABLE incomplete_data (
    id INT,
    value DECIMAL(10,2)
);

INSERT INTO incomplete_data VALUES
(1, NULL),
(2, 100.00),
(3, 150.00),
(4, NULL);

SELECT 
    id,
    value,
    FIRST_VALUE(value) OVER (ORDER BY id) as first_val,
    FIRST_VALUE(value IGNORE NULLS) OVER (ORDER BY id) as first_non_null,
    LAST_VALUE(value) OVER (
        ORDER BY id 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) as last_val
FROM incomplete_data;

The IGNORE NULLS clause (supported in PostgreSQL 11+, Oracle, SQL Server 2022+) skips NULL values when finding the first or last value.

Combining with Other Window Functions

Build comprehensive analytical queries by combining multiple window functions:

SELECT 
    symbol,
    price_date,
    closing_price,
    FIRST_VALUE(closing_price) OVER w as period_first,
    LAST_VALUE(closing_price) OVER w as period_last,
    AVG(closing_price) OVER w as period_avg,
    closing_price - LAG(closing_price) OVER (
        PARTITION BY symbol ORDER BY price_date
    ) as daily_change,
    RANK() OVER (
        PARTITION BY symbol ORDER BY closing_price DESC
    ) as price_rank
FROM stock_prices
WINDOW w AS (
    PARTITION BY symbol 
    ORDER BY price_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY symbol, price_date;

These window functions eliminate complex self-joins and subqueries while maintaining query clarity. Master the frame clause specification, understand default behaviors, and leverage these functions for baseline comparisons, trend analysis, and data quality monitoring in production systems.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.