How to Use EXPLAIN ANALYZE in PostgreSQL

Key Insights

  • EXPLAIN shows PostgreSQL’s query plan, while EXPLAIN ANALYZE actually executes the query and reports real performance metrics—use EXPLAIN for exploration, ANALYZE for accurate diagnosis
  • The “actual time” and “rows” metrics are more valuable than cost estimates; large discrepancies between estimated and actual rows signal stale statistics or poor query design
  • Adding BUFFERS to EXPLAIN ANALYZE reveals I/O patterns that often matter more than CPU time—queries hitting disk instead of cache are your primary optimization targets

Understanding EXPLAIN vs EXPLAIN ANALYZE

PostgreSQL’s query planner makes thousands of decisions per second about how to execute your queries. When performance degrades, you need visibility into those decisions. That’s where EXPLAIN and EXPLAIN ANALYZE come in.

EXPLAIN shows you the query plan without executing anything. It’s fast and safe for production, but only shows estimates. EXPLAIN ANALYZE actually runs the query and measures real performance, giving you accurate timing and row counts at the cost of executing the full query.

-- EXPLAIN only (no execution)
EXPLAIN 
SELECT * FROM orders WHERE customer_id = 12345;

-- EXPLAIN ANALYZE (actually runs the query)
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 12345;

The EXPLAIN-only output might show:

Seq Scan on orders  (cost=0.00..1808.00 rows=50 width=120)
  Filter: (customer_id = 12345)

While EXPLAIN ANALYZE adds reality:

Seq Scan on orders  (cost=0.00..1808.00 rows=50 width=120) 
                    (actual time=0.045..23.891 rows=47 loops=1)
  Filter: (customer_id = 12345)
  Rows Removed by Filter: 99953
Planning Time: 0.156 ms
Execution Time: 23.923 ms

Notice the actual execution found 47 rows (close to the 50 estimate), but took 23.9ms and scanned all 100,000 rows in the table.

Reading the Execution Plan

EXPLAIN ANALYZE output follows a tree structure, indented to show operation hierarchy. Read from the most deeply indented operations outward: those inner nodes produce the rows their parent nodes consume.

Key metrics to understand:

  • cost: Arbitrary units estimating resource usage (startup cost..total cost)
  • rows: Estimated number of rows the planner expects
  • width: Average row size in bytes
  • actual time: Real milliseconds (first row..last row)
  • rows (in actual section): Actual rows returned
  • loops: How many times this operation ran

To see these metrics in context, create a sample table and load it with test data:

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255),
    created_at TIMESTAMP
);

INSERT INTO users (email, created_at)
SELECT 
    'user' || i || '@example.com',
    NOW() - (random() * INTERVAL '365 days')
FROM generate_series(1, 100000) i;

EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'user50000@example.com';

Output breakdown:

Seq Scan on users  (cost=0.00..2137.00 rows=1 width=45) 
                   (actual time=10.234..18.567 rows=1 loops=1)
  Filter: (email = 'user50000@example.com'::text)
  Rows Removed by Filter: 99999
Planning Time: 0.123 ms
Execution Time: 18.589 ms

This sequential scan read all 100,000 rows (Rows Removed by Filter) to find one match. The execution took 18.6ms—acceptable for an occasional query, but problematic if this runs frequently.

Sequential Scans vs Index Scans

PostgreSQL chooses between scanning the entire table or using an index based on cost estimates. Understanding this choice is critical for optimization.

-- Create an index
CREATE INDEX idx_users_email ON users(email);

-- Rerun the same query
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'user50000@example.com';

Now the plan changes:

Index Scan using idx_users_email on users  
    (cost=0.29..8.31 rows=1 width=45) 
    (actual time=0.034..0.035 rows=1 loops=1)
  Index Cond: (email = 'user50000@example.com'::text)
Planning Time: 0.289 ms
Execution Time: 0.056 ms

Execution time dropped from 18.6ms to 0.056ms—a 330x improvement. The index scan found the target row directly without scanning the full table.
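
To confirm that an index keeps getting used over time, not just in a single plan, the standard statistics view pg_stat_user_indexes tracks per-index scan counts:

```sql
-- Cumulative usage counters for every index on users
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'users';
```

An idx_scan that stays at zero over a long period suggests the index is never chosen by the planner and may be dead weight.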

Join Performance Analysis

Joins reveal the most interesting execution plan variations. PostgreSQL uses three primary join strategies: nested loop, hash join, and merge join. Build a second table to see them in action:

CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INTEGER,
    total DECIMAL(10,2),
    created_at TIMESTAMP
);

INSERT INTO orders (user_id, total, created_at)
SELECT 
    (random() * 100000)::INTEGER,
    (random() * 1000)::DECIMAL(10,2),
    NOW() - (random() * INTERVAL '365 days')
FROM generate_series(1, 500000);

CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Nested Loop Join (small result sets)
EXPLAIN ANALYZE
SELECT u.email, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.id = 50000;

Output shows a nested loop:

Nested Loop  (cost=4.86..45.23 rows=5 width=51) 
             (actual time=0.067..0.123 rows=4 loops=1)
  ->  Index Scan using users_pkey on users u  
      (cost=0.29..8.31 rows=1 width=45)
      (actual time=0.023..0.024 rows=1 loops=1)
        Index Cond: (id = 50000)
  ->  Bitmap Heap Scan on orders o  
      (cost=4.57..36.87 rows=5 width=14)
      (actual time=0.038..0.091 rows=4 loops=1)
        Recheck Cond: (user_id = 50000)
        ->  Bitmap Index Scan on idx_orders_user_id  
            (cost=0.00..4.57 rows=5 width=0)
            (actual time=0.029..0.029 rows=4 loops=1)
              Index Cond: (user_id = 50000)

For larger joins, PostgreSQL might choose a hash join:

EXPLAIN ANALYZE
SELECT u.email, COUNT(o.id) as order_count
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.email;

This produces a hash join plan, which builds a hash table from one side and probes it with the other—efficient for large datasets.
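
To compare strategies on the same query, you can temporarily steer the planner with the session-level enable_* settings. These are diagnostic toggles for exploring alternative plans, not production hints:

```sql
-- Discourage hash joins in this session to see the alternative plan
SET enable_hashjoin = off;

EXPLAIN ANALYZE
SELECT u.email, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.email;

-- Restore the planner's default behavior
RESET enable_hashjoin;
```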

Spotting Performance Problems

Large discrepancies between estimated and actual rows indicate stale statistics:

-- Simulate stale statistics
DELETE FROM orders WHERE user_id < 10000;

EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id < 10000;

If the planner still estimates thousands of rows but the actual row count is zero, refresh the statistics:

ANALYZE orders;

This updates table statistics, allowing the planner to make better decisions.
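
To check when a table’s statistics were last refreshed, whether manually or by autovacuum, query pg_stat_user_tables:

```sql
-- When were statistics last gathered, and how stale is the table?
SELECT relname, last_analyze, last_autoanalyze, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'orders';
```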

Another red flag is high “Rows Removed by Filter” counts:

EXPLAIN ANALYZE
SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '7 days';

If you see thousands of rows removed by filter, you need an index on created_at:

CREATE INDEX idx_orders_created_at ON orders(created_at);

Advanced EXPLAIN Options

The BUFFERS option reveals I/O patterns—often more important than CPU time:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 50000;

Output includes:

Buffers: shared hit=8 read=2

  • shared hit: Pages found in PostgreSQL’s shared buffer cache (fast)
  • read: Pages read from disk (slow)

A high read count indicates cache misses. If this query runs frequently, you’re hitting disk repeatedly—a primary optimization target.
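
Beyond a single query, pg_statio_user_tables accumulates buffer hits and disk reads per table, so you can compute a rough cache hit ratio:

```sql
-- Approximate heap cache hit ratio for orders (1.0 = fully cached)
SELECT relname,
       heap_blks_hit,
       heap_blks_read,
       round(heap_blks_hit::numeric /
             NULLIF(heap_blks_hit + heap_blks_read, 0), 4) AS hit_ratio
FROM pg_statio_user_tables
WHERE relname = 'orders';
```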

The VERBOSE option shows which columns are being retrieved:

EXPLAIN (ANALYZE, VERBOSE)
SELECT email FROM users WHERE id = 50000;

For programmatic analysis, use FORMAT JSON:

EXPLAIN (ANALYZE, FORMAT JSON)
SELECT * FROM users WHERE id = 50000;

This produces machine-readable output perfect for monitoring tools or automated analysis.
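
As a sketch of programmatic use, a PL/pgSQL block can capture that JSON and extract a single field. EXPLAIN’s output row arrives as text here, so it is cast to json before navigating (the top level is a one-element array):

```sql
DO $$
DECLARE
    plan_text text;
BEGIN
    -- EXECUTE ... INTO captures the first output row of EXPLAIN
    EXECUTE 'EXPLAIN (ANALYZE, FORMAT JSON) SELECT * FROM users WHERE id = 50000'
        INTO plan_text;
    RAISE NOTICE 'Execution time: % ms',
        plan_text::json -> 0 ->> 'Execution Time';
END $$;
```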

Production-Safe Analysis

Never run EXPLAIN ANALYZE on DELETE, UPDATE, or INSERT statements in production without wrapping in a transaction:

BEGIN;
EXPLAIN ANALYZE
UPDATE orders SET total = total * 1.1 WHERE user_id = 50000;
ROLLBACK;

The ROLLBACK discards the data changes while still letting you measure performance. Note that the statement really does execute: it writes WAL and leaves dead row versions behind even after the rollback, so use this sparingly on hot production tables.

For read queries, EXPLAIN ANALYZE is safe but still executes the full query. On large datasets, use plain EXPLAIN first:

-- Safe exploration
EXPLAIN SELECT * FROM huge_table WHERE complex_condition;

-- Only if EXPLAIN looks reasonable
EXPLAIN ANALYZE SELECT * FROM huge_table WHERE complex_condition;

Maintaining Query Performance

Regular statistics updates keep the planner accurate:

-- Update statistics for specific table
ANALYZE orders;

-- Update all tables (run during maintenance windows)
ANALYZE;

For broader query analysis, enable pg_stat_statements. The module must be listed in shared_preload_libraries in postgresql.conf (which requires a server restart) before it can collect data:

CREATE EXTENSION pg_stat_statements;

-- Find slowest queries
SELECT 
    query,
    calls,
    mean_exec_time,
    max_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

This identifies which queries deserve EXPLAIN ANALYZE investigation. (The mean_exec_time and max_exec_time columns exist in PostgreSQL 13 and later; older versions call them mean_time and max_time.)

Conclusion

EXPLAIN ANALYZE transforms query optimization from guesswork into data-driven decisions. Start with EXPLAIN for safe exploration, add ANALYZE for accurate measurements, and include BUFFERS when I/O patterns matter. Watch for large actual vs. estimated row discrepancies, high filter removal counts, and sequential scans on large tables. Regular ANALYZE commands keep statistics fresh, while pg_stat_statements identifies problematic queries across your entire application.

Master these tools and you’ll diagnose performance issues in minutes instead of hours.
