SQL: Index Types and When to Use Them
Indexes are data structures that allow your database to find rows without scanning entire tables. Think of them like a book's index—instead of reading every page to find mentions of 'B-tree,' you...
Key Insights
- B-tree indexes speed up queries but slow down writes—index columns used in WHERE, JOIN, and ORDER BY clauses, but avoid over-indexing small tables or columns with low cardinality
- Composite indexes follow the leftmost prefix rule: an index on (last_name, first_name, email) works for queries filtering on last_name alone or last_name + first_name, but not first_name alone
- Partial and expression indexes are powerful SQLite features that reduce index size and enable optimization for specific query patterns like case-insensitive searches or filtering soft-deleted records
Introduction to Database Indexes
Indexes are data structures that allow your database to find rows without scanning entire tables. Think of them like a book’s index—instead of reading every page to find mentions of “B-tree,” you check the index and jump directly to the relevant pages.
The fundamental trade-off is simple: indexes make reads faster but writes slower. Every INSERT, UPDATE, or DELETE must update not just the table but also its indexes. This overhead is usually worth it, but not always.
Skip indexes when:
- The table has fewer than a few thousand rows
- You’re indexing columns with very low cardinality (like boolean flags)
- The column is rarely used in queries
- Write performance is critical and reads are infrequent
Here’s how to verify whether an index helps:
-- Without index
EXPLAIN QUERY PLAN
SELECT * FROM users WHERE email = 'user@example.com';
-- Output: SCAN users
-- Create index
CREATE INDEX idx_users_email ON users(email);
-- With index
EXPLAIN QUERY PLAN
SELECT * FROM users WHERE email = 'user@example.com';
-- Output: SEARCH users USING INDEX idx_users_email (email=?)
The difference between SCAN (full table scan) and SEARCH (index lookup) is often the difference between milliseconds and minutes on large tables.
B-Tree Indexes (Default)
SQLite uses B-tree indexes exclusively. A B-tree is a self-balancing tree structure that maintains sorted data and allows searches, insertions, and deletions in logarithmic time. This makes B-trees excellent for:
- Equality searches:
WHERE id = 5 - Range queries:
WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31' - Sorting:
ORDER BY last_name - Prefix matching:
WHERE email LIKE 'admin%'(but notWHERE email LIKE '%admin')
Creating a basic index is straightforward:
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT NOT NULL,
last_name TEXT,
first_name TEXT,
created_at TEXT
);
-- Single-column index
CREATE INDEX idx_users_created_at ON users(created_at);
-- Composite index (column order matters!)
CREATE INDEX idx_users_name ON users(last_name, first_name);
-- Query that uses the created_at index
SELECT * FROM users
WHERE created_at > '2024-01-01'
ORDER BY created_at;
-- Query that uses the composite name index
SELECT * FROM users
WHERE last_name = 'Smith'
ORDER BY first_name;
The composite index on (last_name, first_name) works because of the leftmost prefix rule—more on that later.
Unique Indexes
Unique indexes serve dual purposes: enforcing data integrity and optimizing queries. When you declare a column UNIQUE, SQLite automatically creates a unique index.
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT UNIQUE, -- Automatically creates unique index
username TEXT
);
-- Explicit unique index (equivalent to above for email)
CREATE UNIQUE INDEX idx_users_username ON users(username);
-- Attempting to insert duplicate values fails
INSERT INTO users (email, username) VALUES ('test@example.com', 'john');
INSERT INTO users (email, username) VALUES ('test@example.com', 'jane');
-- Error: UNIQUE constraint failed: users.email
Unique indexes are slightly faster than regular indexes for lookups because SQLite can stop searching once it finds a match—it knows there can’t be another one. Use them liberally for natural keys like email addresses, usernames, and external IDs.
Composite unique indexes enforce uniqueness across multiple columns:
CREATE TABLE user_roles (
user_id INTEGER,
role_id INTEGER,
UNIQUE(user_id, role_id)
);
-- This is fine
INSERT INTO user_roles VALUES (1, 100);
INSERT INTO user_roles VALUES (1, 101);
-- This fails
INSERT INTO user_roles VALUES (1, 100);
-- Error: UNIQUE constraint failed
Partial Indexes
Partial indexes only index rows matching a WHERE condition. This is incredibly useful for reducing index size and maintenance overhead when you only care about a subset of data.
Common use cases:
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT,
is_active INTEGER DEFAULT 1,
deleted_at TEXT
);
-- Only index active users
CREATE INDEX idx_active_users_email
ON users(email)
WHERE is_active = 1;
-- Only index non-deleted records (soft deletes)
CREATE INDEX idx_users_not_deleted
ON users(email, last_name)
WHERE deleted_at IS NULL;
-- Only index premium users
CREATE INDEX idx_premium_users
ON users(last_name, first_name)
WHERE subscription_tier = 'premium';
The query optimizer uses partial indexes only when your query’s WHERE clause is compatible with the index’s WHERE clause:
-- Uses idx_active_users_email
SELECT * FROM users
WHERE is_active = 1 AND email = 'test@example.com';
-- Does NOT use idx_active_users_email (missing is_active condition)
SELECT * FROM users
WHERE email = 'test@example.com';
If 90% of your users are active, a partial index on active users is 90% smaller than a full index—that’s less disk space, less memory, and faster updates.
Expression Indexes
Expression indexes store the result of an expression or function call rather than raw column values. This enables optimization for queries that filter or sort on computed values.
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT,
created_at TEXT
);
-- Case-insensitive email searches
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = LOWER('User@Example.com');
-- Extract year from date
CREATE INDEX idx_users_year ON users(strftime('%Y', created_at));
SELECT * FROM users WHERE strftime('%Y', created_at) = '2024';
-- JSON field extraction (SQLite 3.38+)
CREATE TABLE events (
id INTEGER PRIMARY KEY,
data TEXT -- JSON data
);
CREATE INDEX idx_events_user_id ON events(json_extract(data, '$.user_id'));
SELECT * FROM events WHERE json_extract(data, '$.user_id') = 123;
Without expression indexes, these queries would require full table scans even with regular indexes on the base columns. The expression in your query must exactly match the expression in the index definition.
Multi-Column (Composite) Indexes
Composite indexes are the most misunderstood index type. The leftmost prefix rule is critical: an index on (A, B, C) can be used for queries filtering on:
- A
- A and B
- A, B, and C
But NOT for:
- B alone
- C alone
- B and C
Think of it like a phone book sorted by last name, then first name. You can efficiently find “Smith, John” or all “Smiths,” but you can’t efficiently find all “Johns” without scanning the entire book.
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
user_id INTEGER,
status TEXT,
created_at TEXT
);
CREATE INDEX idx_orders_user_status_date
ON orders(user_id, status, created_at);
-- Uses the index (leftmost prefix)
SELECT * FROM orders WHERE user_id = 123;
-- Uses the index (leftmost prefix + second column)
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
-- Uses the index (all columns)
SELECT * FROM orders
WHERE user_id = 123 AND status = 'pending'
ORDER BY created_at;
-- Does NOT use the index efficiently (missing leftmost column)
SELECT * FROM orders WHERE status = 'pending';
SQLite can sometimes use index intersection—combining multiple single-column indexes—but it’s less efficient than a well-designed composite index. When in doubt, create the composite index matching your most common query patterns.
Index Maintenance and Best Practices
Indexes aren’t “set and forget.” Regularly analyze their usage and performance:
-- See how SQLite plans to execute a query
EXPLAIN QUERY PLAN
SELECT * FROM users WHERE email = 'test@example.com';
-- Update index statistics (helps query planner)
ANALYZE;
-- Rebuild indexes (rarely needed, but useful after bulk operations)
REINDEX;
-- Check index size
SELECT name, tbl_name
FROM sqlite_master
WHERE type = 'index';
SQLite maintains statistics in sqlite_stat1 and sqlite_stat4 tables. The ANALYZE command updates these statistics, helping the query planner make better decisions.
General index strategy guidelines:
- Start with obvious indexes: Primary keys, foreign keys, and columns in WHERE clauses
- Profile before optimizing: Use EXPLAIN QUERY PLAN on your actual queries
- Consider write patterns: High-write tables need fewer indexes
- Composite over multiple: One index on (A, B) is better than separate indexes on A and B
- Partial indexes for sparse conditions: If you filter on
is_active = 1frequently, use a partial index - Remove unused indexes: Every index has a cost; drop ones that don’t improve queries
The sweet spot is usually 3-5 indexes per table. More than that and you’re probably over-indexing, which slows down writes without meaningful read improvements.
Indexes are your most powerful tool for database performance, but they require thoughtful application. Focus on your actual query patterns, measure the impact, and don’t be afraid to drop indexes that aren’t pulling their weight.