How to Create Indexes in PostgreSQL
Indexes are data structures that PostgreSQL uses to find rows faster without scanning entire tables. Think of them like a book's index—instead of reading every page to find a topic, you jump directly...
Key Insights
- PostgreSQL indexes dramatically speed up queries but come with trade-offs: they slow down writes and consume disk space, so strategic placement matters more than quantity.
- Beyond basic B-tree indexes, PostgreSQL offers specialized types like GIN for JSONB/arrays and BRIN for time-series data—choosing the right type can improve performance by orders of magnitude.
- Use
CREATE INDEX CONCURRENTLYin production to avoid table locks, and regularly monitorpg_stat_user_indexesto identify and remove unused indexes that waste resources.
Introduction to PostgreSQL Indexes
Indexes are data structures that PostgreSQL uses to find rows faster without scanning entire tables. Think of them like a book’s index—instead of reading every page to find a topic, you jump directly to the relevant pages.
Without indexes, PostgreSQL performs sequential scans, reading every row in a table. This works fine for small tables with thousands of rows, but becomes prohibitively slow as tables grow to millions of records. A well-placed index can reduce query time from minutes to milliseconds.
The trade-off is straightforward: indexes accelerate reads but slow down writes. Every INSERT, UPDATE, or DELETE must also update all relevant indexes. Indexes also consume disk space—sometimes as much as the table itself. This means you need to be strategic about which columns to index rather than indexing everything.
Basic Index Creation Syntax
The fundamental syntax for creating an index is simple:
CREATE INDEX index_name ON table_name (column_name);
Here’s a practical example with a users table:
-- Create a sample table
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL,
username VARCHAR(100) NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
last_login TIMESTAMP
);
-- Create an index on the email column
CREATE INDEX idx_users_email ON users (email);
Follow a consistent naming convention. I recommend idx_tablename_columnname for basic indexes. This makes it immediately clear what table and column the index covers when you’re reviewing database structure.
PostgreSQL automatically creates indexes for PRIMARY KEY and UNIQUE constraints, so you don’t need to manually index those columns. The id column in our example already has an index.
Index Types in PostgreSQL
PostgreSQL supports six index types, each optimized for different scenarios. Choosing the right type is crucial for optimal performance.
B-tree (Default): Handles comparison operators (<, <=, =, >=, >) and is suitable for most use cases. B-tree indexes maintain sorted data and work well for range queries.
-- B-tree is the default, these are equivalent
CREATE INDEX idx_users_created ON users (created_at);
CREATE INDEX idx_users_created ON users USING btree (created_at);
Hash: Optimized for simple equality comparisons. Rarely used in practice because B-tree indexes handle equality efficiently and offer more flexibility.
GIN (Generalized Inverted Index): Perfect for indexing composite values like arrays, JSONB documents, and full-text search. GIN indexes are larger but excel at searching within complex data types.
-- Create a table with JSONB data
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255),
attributes JSONB
);
-- GIN index for JSONB queries
CREATE INDEX idx_products_attributes ON products USING gin (attributes);
-- Now queries like this use the index
SELECT * FROM products WHERE attributes @> '{"color": "blue"}';
GiST (Generalized Search Tree): Used for geometric data, full-text search, and nearest-neighbor searches. Smaller than GIN but slower for lookups.
SP-GiST (Space-Partitioned GiST): Optimized for data with non-balanced structures like phone numbers or IP addresses.
BRIN (Block Range Index): Extremely small indexes for massive tables with natural ordering. Perfect for time-series data where rows are physically ordered by timestamp.
-- BRIN index for time-series data
CREATE TABLE sensor_readings (
id BIGSERIAL PRIMARY KEY,
sensor_id INTEGER,
reading NUMERIC,
recorded_at TIMESTAMP
);
-- BRIN index is tiny but effective for time-range queries
CREATE INDEX idx_sensor_readings_time ON sensor_readings USING brin (recorded_at);
BRIN indexes can be hundreds of times smaller than B-tree indexes while still providing significant performance benefits for range queries on naturally ordered data.
Advanced Index Techniques
Multi-column (Composite) Indexes: When queries filter on multiple columns together, composite indexes are essential. Column order matters—place the most selective column first.
-- Composite index for queries filtering by status and created date
CREATE INDEX idx_orders_status_created ON orders (status, created_at);
-- This query uses the index efficiently
SELECT * FROM orders WHERE status = 'pending' AND created_at > '2024-01-01';
-- This query also uses the index (leftmost column)
SELECT * FROM orders WHERE status = 'pending';
-- This query does NOT use the index efficiently (skips leftmost column)
SELECT * FROM orders WHERE created_at > '2024-01-01';
Partial Indexes: Index only a subset of rows using a WHERE clause. This reduces index size and improves performance when you consistently query the same subset.
-- Only index active users
CREATE INDEX idx_users_active_email ON users (email) WHERE status = 'active';
-- Only index recent orders
CREATE INDEX idx_orders_recent ON orders (created_at)
WHERE created_at > '2024-01-01';
Partial indexes are particularly powerful for soft-deleted records or status-based filtering where you rarely query certain states.
Expression Indexes: Index the result of a function or expression rather than raw column values. Essential for case-insensitive searches or computed values.
-- Case-insensitive email search
CREATE INDEX idx_users_email_lower ON users (LOWER(email));
-- Now this query uses the index
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
-- Index on extracted JSON value
CREATE INDEX idx_products_category ON products ((attributes->>'category'));
Unique Indexes: Enforce uniqueness while providing index benefits. Use these instead of separate unique constraints when possible.
-- Ensure unique usernames (case-insensitive)
CREATE UNIQUE INDEX idx_users_username_unique ON users (LOWER(username));
Index Creation Options
Concurrent Index Creation: By default, CREATE INDEX locks the table against writes. In production, use CONCURRENTLY to avoid downtime.
-- Allows reads and writes during index creation
CREATE INDEX CONCURRENTLY idx_users_last_login ON users (last_login);
The trade-off is that concurrent creation takes longer and cannot run inside a transaction block. If it fails, it leaves behind an invalid index that you must manually drop.
Fill Factor: Controls how full index pages become before PostgreSQL starts a new page. Lower values leave room for updates without page splits.
-- 70% fill factor for frequently updated tables
CREATE INDEX idx_users_email ON users (email) WITH (fillfactor = 70);
The default fill factor is 90. Use lower values (70-80) for tables with frequent updates, and keep 90 for read-heavy tables.
Tablespace Specification: Place indexes on faster storage or separate disks for I/O distribution.
-- Create index on SSD tablespace
CREATE INDEX idx_users_email ON users (email) TABLESPACE fast_ssd;
Monitoring and Maintaining Indexes
Creating indexes is only half the battle. You need to monitor their effectiveness and remove unused ones.
View Existing Indexes:
-- List all indexes for a table
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'users';
-- Detailed index information
SELECT
schemaname,
tablename,
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
Check Index Usage:
-- Identify unused indexes
SELECT
schemaname,
tablename,
indexname,
idx_scan,
idx_tup_read,
idx_tup_fetch,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelid NOT IN (
SELECT indexrelid FROM pg_index WHERE indisunique
)
ORDER BY pg_relation_size(indexrelid) DESC;
This query finds indexes that have never been scanned (excluding unique indexes which enforce constraints). These are prime candidates for removal.
Verify Index Usage in Queries:
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'test@example.com';
Look for “Index Scan” or “Index Only Scan” in the output. If you see “Seq Scan” on a large table, your index isn’t being used.
Drop Unused Indexes:
DROP INDEX CONCURRENTLY idx_users_unused;
Always use CONCURRENTLY when dropping indexes in production to avoid locking.
Best Practices and Common Pitfalls
Index columns with high selectivity. A column where most values are unique (like email) benefits greatly from indexing. A boolean column with only true/false values rarely benefits from a full index—consider a partial index instead.
Avoid indexing small tables. If a table has fewer than a few thousand rows, sequential scans are often faster than index lookups. PostgreSQL’s query planner will ignore your index anyway.
Don’t over-index. Every index slows down writes and consumes storage. More than 5-6 indexes on a single table is often a red flag. Review your query patterns and consolidate where possible.
Maintain indexes regularly. B-tree indexes become bloated over time with updates and deletes. Schedule regular maintenance:
-- Rebuild a bloated index
REINDEX INDEX CONCURRENTLY idx_users_email;
-- Or rebuild all indexes on a table
REINDEX TABLE CONCURRENTLY users;
Run VACUUM regularly to prevent bloat. Most PostgreSQL installations have autovacuum enabled by default, but heavily updated tables may need manual intervention.
Test before deploying. Create indexes on a staging environment with production-like data volumes. Use EXPLAIN ANALYZE to verify performance improvements. I’ve seen indexes that helped small datasets but hurt performance on production-scale data.
Monitor query performance continuously. Use pg_stat_statements to identify slow queries, then create targeted indexes. Don’t guess—let actual query patterns drive your indexing strategy.
The key to effective indexing is balance. Start with indexes on foreign keys and frequently filtered columns, monitor query performance, and iterate based on real-world usage patterns. A few well-chosen indexes will serve you better than dozens of speculative ones.