How to Use Transactions in PostgreSQL

Key Insights

  • Transactions ensure data integrity through ACID properties, preventing partial updates that could corrupt your database state—always wrap related operations in a single transaction block.
  • PostgreSQL defaults to READ COMMITTED isolation, but you need REPEATABLE READ or SERIALIZABLE for scenarios requiring consistent snapshots across multiple queries within a transaction.
  • Keep transactions short and focused; long-running transactions block concurrent operations, increase deadlock risk, and can cause table bloat from MVCC overhead.

Introduction to Database Transactions

Transactions are the foundation of data integrity in PostgreSQL. They guarantee that a series of operations either complete entirely or leave no trace, preventing the nightmare scenario where your database ends up in an inconsistent state.

The ACID properties define what makes a transaction reliable:

  • Atomicity: All operations succeed or all fail—no partial updates
  • Consistency: Database moves from one valid state to another
  • Isolation: Concurrent transactions don’t interfere with each other
  • Durability: Committed changes survive system crashes

Consider a bank transfer between accounts. Without transactions, you might deduct money from one account but fail to add it to another due to a network error or constraint violation. The money vanishes. Transactions prevent this.

Here’s what happens without transaction protection:

-- Dangerous: No transaction wrapper
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
-- What if the system crashes here?
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;

If anything goes wrong between these statements, you’ve corrupted your data. Transactions solve this fundamental problem.

Basic Transaction Syntax

PostgreSQL transactions use three core commands: BEGIN, COMMIT, and ROLLBACK. An explicit transaction starts with BEGIN (or the equivalent START TRANSACTION), and you either COMMIT to make the changes permanent or ROLLBACK to discard them.

Here’s the bank transfer done correctly:

BEGIN;

UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;

COMMIT;

Now both updates happen atomically. If either fails, neither takes effect.

You can explicitly abort a transaction with ROLLBACK:

BEGIN;

INSERT INTO orders (user_id, total) VALUES (123, 99.99);
INSERT INTO order_items (order_id, product_id, quantity) 
VALUES (currval('orders_id_seq'), 456, 2);

-- Changed your mind? Undo everything
ROLLBACK;

After ROLLBACK, it’s as if the transaction never happened. The database state is unchanged.

A successful multi-step operation:

BEGIN;

-- Create user
INSERT INTO users (email, name) 
VALUES ('user@example.com', 'John Doe')
RETURNING id;

-- Create associated profile (assuming the insert above returned id = 1)
INSERT INTO user_profiles (user_id, bio, avatar_url)
VALUES (1, 'Software developer', '/avatars/default.png');

-- Create initial settings
INSERT INTO user_settings (user_id, theme, notifications_enabled)
VALUES (1, 'dark', true);

COMMIT;

All three inserts succeed together or fail together. No orphaned records.
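The hardcoded user_id = 1 above stands in for the value RETURNING gives you. If you'd rather not round-trip the id through application code, a data-modifying CTE can chain the inserts into one statement — a sketch against the same hypothetical tables:

```sql
-- Sketch: carry the generated id from users into a dependent insert
WITH new_user AS (
    INSERT INTO users (email, name)
    VALUES ('user@example.com', 'John Doe')
    RETURNING id
)
INSERT INTO user_profiles (user_id, bio, avatar_url)
SELECT id, 'Software developer', '/avatars/default.png'
FROM new_user;
```

Each statement is atomic on its own, but you still want the surrounding BEGIN/COMMIT when several such statements must succeed together.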

Savepoints and Partial Rollbacks

Savepoints let you create checkpoints within a transaction, enabling partial rollbacks without abandoning all your work. This is invaluable for complex, multi-step processes.

BEGIN;

-- Step 1: Create the order
INSERT INTO orders (user_id, status, total)
VALUES (123, 'pending', 0)
RETURNING id; -- Assume this returns 501

SAVEPOINT order_created;

-- Step 2: Add items
INSERT INTO order_items (order_id, product_id, quantity, price)
VALUES (501, 1001, 2, 29.99);

INSERT INTO order_items (order_id, product_id, quantity, price)
VALUES (501, 1002, 1, 49.99);

SAVEPOINT items_added;

-- Step 3: Apply discount code (might fail)
UPDATE orders 
SET discount_code = 'SUMMER2024', total = total - 10
WHERE id = 501;

-- Discount code was invalid, roll back just the discount
ROLLBACK TO SAVEPOINT items_added;

-- Step 4: Calculate final total
UPDATE orders 
SET total = (SELECT SUM(quantity * price) FROM order_items WHERE order_id = 501)
WHERE id = 501;

COMMIT;

The order and items persist, but the failed discount application gets rolled back. You can also release savepoints when you’re confident about a checkpoint:

SAVEPOINT critical_section;
-- ... operations ...
RELEASE SAVEPOINT critical_section; -- The savepoint is discarded; you can no longer roll back to it

Transaction Isolation Levels

Isolation levels control how transactions interact with concurrent operations. PostgreSQL supports four levels, each trading consistency for performance.

READ COMMITTED (PostgreSQL default): Each query sees data committed before it began. This prevents dirty reads but allows non-repeatable reads.

-- Session 1
BEGIN;
SELECT balance FROM accounts WHERE account_id = 1; -- Returns 1000

-- Session 2 (meanwhile)
UPDATE accounts SET balance = 500 WHERE account_id = 1;
COMMIT;

-- Session 1 (continuing)
SELECT balance FROM accounts WHERE account_id = 1; -- Returns 500 (different!)
COMMIT;

The same query in Session 1 returns different results because Session 2 committed between reads.

REPEATABLE READ: Queries see a snapshot taken at the transaction’s first statement. This prevents non-repeatable reads, and although the SQL standard permits phantom reads at this level, PostgreSQL’s implementation prevents those as well; a concurrent writer can instead cause your UPDATE or DELETE to fail with a serialization error.

-- Session 1
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE account_id = 1; -- Returns 1000

-- Session 2
UPDATE accounts SET balance = 500 WHERE account_id = 1;
COMMIT;

-- Session 1
SELECT balance FROM accounts WHERE account_id = 1; -- Still returns 1000
COMMIT;

Session 1 sees a consistent snapshot throughout the transaction.

SERIALIZABLE: The strictest level. Transactions behave as if they ran serially, one after another. PostgreSQL implements this with serializable snapshot isolation (SSI), using predicate locks to detect dangerous read/write dependencies instead of blocking.

-- Session 1
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT SUM(balance) FROM accounts WHERE user_id = 123; -- Returns 5000

-- Session 2
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO accounts (user_id, balance) VALUES (123, 1000);
COMMIT;

-- Session 1
SELECT SUM(balance) FROM accounts WHERE user_id = 123; -- Still 5000
-- Try to update based on the sum
UPDATE summary SET total = 5000 WHERE user_id = 123;
COMMIT; -- ERROR: could not serialize access due to read/write dependencies

PostgreSQL aborts one transaction to maintain serializability. Your application must retry.

READ UNCOMMITTED: PostgreSQL treats this as READ COMMITTED. Dirty reads aren’t possible in PostgreSQL.

Handling Transaction Errors

Transaction errors require careful handling, especially serialization failures and deadlocks. Your application code must detect and retry these.

-- This will fail due to serialization conflict
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... operations that conflict with another transaction ...
COMMIT; -- ERROR: could not serialize access

In Python with psycopg2, implement retry logic:

import psycopg2
from psycopg2 import errorcodes
import time

def transfer_with_retry(conn, from_account, to_account, amount, max_retries=3):
    for attempt in range(max_retries):
        try:
            with conn:  # Automatic BEGIN/COMMIT/ROLLBACK
                with conn.cursor() as cur:
                    cur.execute(
                        "UPDATE accounts SET balance = balance - %s WHERE id = %s",
                        (amount, from_account)
                    )
                    cur.execute(
                        "UPDATE accounts SET balance = balance + %s WHERE id = %s",
                        (amount, to_account)
                    )
            return True  # Success
            
        except psycopg2.Error as e:
            # Serialization failures and deadlocks are transient: retry both
            if e.pgcode in (errorcodes.SERIALIZATION_FAILURE,
                            errorcodes.DEADLOCK_DETECTED):
                if attempt < max_retries - 1:
                    time.sleep(0.1 * (2 ** attempt))  # Exponential backoff
                    continue
            raise  # Not a retryable error, or out of retries
    
    return False

This handles the common case where SERIALIZABLE transactions conflict. The exponential backoff reduces contention.
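The retry-with-backoff shape generalizes to any transactional work, not just psycopg2. A minimal, database-free sketch, where SerializationFailure and flaky_txn are stand-ins for a real driver error and a real transaction:

```python
import time

class SerializationFailure(Exception):
    """Stand-in for a database driver's serialization error (hypothetical)."""

def run_with_retry(txn, max_retries=3, base_delay=0.01):
    """Run txn(), retrying with exponential backoff on serialization failure."""
    for attempt in range(max_retries):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_retries - 1:
                raise  # Out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, ...

# Simulated transaction: conflicts twice, then succeeds on the third try
attempts = {"count": 0}
def flaky_txn():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise SerializationFailure()
    return "committed"

print(run_with_retry(flaky_txn))  # committed
```

The cap on retries matters: under heavy contention, unbounded retries can amplify load instead of shedding it.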

Constraint violations need different handling:

try:
    with conn:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO users (email, name) VALUES (%s, %s)",
                ('user@example.com', 'John Doe')
            )
except psycopg2.IntegrityError as e:
    if e.pgcode == errorcodes.UNIQUE_VIOLATION:
        print("User already exists")
    else:
        raise

Best Practices and Common Pitfalls

Keep transactions short. Long-running transactions hold locks, block VACUUM, and cause table bloat. Each transaction creates a new version of modified rows; old versions can’t be cleaned until all transactions that might see them complete.

Anti-pattern: Waiting for user input inside a transaction

# DON'T DO THIS
conn = psycopg2.connect(...)
cur = conn.cursor()
# psycopg2 opens the transaction implicitly at the first statement
cur.execute("SELECT * FROM products WHERE id = %s FOR UPDATE", (product_id,))
user_input = input("Enter quantity: ")  # Transaction open during user wait!
cur.execute("UPDATE products SET stock = stock - %s WHERE id = %s",
            (user_input, product_id))
conn.commit()

This holds locks while waiting for user input—potentially minutes. Other transactions block.

Correct pattern: Gather data first, then transact

# DO THIS
conn = psycopg2.connect(...)
quantity = int(input("Enter quantity: "))  # Get and validate input first

with conn:  # Transaction starts here
    with conn.cursor() as cur:
        cur.execute("UPDATE products SET stock = stock - %s WHERE id = %s",
                    (quantity, product_id))

Avoid mixing transaction control levels. Don’t manually BEGIN when using connection context managers—they handle it automatically.

Choose appropriate isolation levels. Use READ COMMITTED for most workloads. Reserve SERIALIZABLE for scenarios requiring absolute consistency, and implement retry logic.

Batch operations within transactions. Instead of committing after each insert, batch them:

BEGIN;
INSERT INTO logs (message) VALUES ('Entry 1');
INSERT INTO logs (message) VALUES ('Entry 2');
-- ... hundreds more ...
COMMIT;

This is orders of magnitude faster than individual commits.
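You can often go a step further: PostgreSQL accepts multi-row VALUES lists, so many logical inserts can collapse into a single statement inside the transaction, cutting per-statement overhead as well:

```sql
BEGIN;
INSERT INTO logs (message)
VALUES ('Entry 1'),
       ('Entry 2'),
       ('Entry 3');  -- one statement, many rows
COMMIT;
```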

Monitor transaction age. Query pg_stat_activity to find long-running transactions:

SELECT pid, usename, state, now() - xact_start AS duration, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start;

Kill problematic transactions if needed: SELECT pg_terminate_backend(pid);
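pg_terminate_backend drops the whole connection and rolls back its open transaction. When you only need to stop a runaway query, pg_cancel_backend is the gentler first step (the pid here is illustrative):

```sql
-- First try: cancel the current query, keep the session alive
SELECT pg_cancel_backend(12345);

-- Last resort: terminate the backend; its open transaction rolls back
SELECT pg_terminate_backend(12345);
```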

Transactions are non-negotiable for data integrity. Master these patterns, understand isolation levels, and keep transactions focused. Your database will reward you with consistent data and better performance.
