How to Use UPSERT (ON CONFLICT) in PostgreSQL

PostgreSQL's `INSERT...ON CONFLICT` syntax, commonly called UPSERT (a portmanteau of UPDATE and INSERT), solves a fundamental problem in database operations: how to insert a row if it doesn't exist,...

Key Insights

  • PostgreSQL’s INSERT...ON CONFLICT provides atomic UPSERT operations that eliminate race conditions inherent in separate SELECT/INSERT/UPDATE patterns, making it essential for concurrent applications.
  • The EXCLUDED table reference gives you access to the proposed row values during conflict resolution, enabling sophisticated conditional logic like “only update if the new value is greater than the existing one.”
  • Proper indexing on conflict targets isn’t optional—without a unique index or constraint on your conflict columns, PostgreSQL will throw an error and your UPSERT will fail.

Introduction to UPSERT Operations

PostgreSQL’s INSERT...ON CONFLICT syntax, commonly called UPSERT (a portmanteau of UPDATE and INSERT), solves a fundamental problem in database operations: how to insert a row if it doesn’t exist, or update it if it does—all in a single atomic operation.

Before ON CONFLICT was introduced in PostgreSQL 9.5, developers had to use awkward workarounds: separate SELECT-then-INSERT-or-UPDATE logic that created race conditions, or complex CTEs with writeable queries. These approaches were error-prone and performed poorly under concurrent load.

UPSERT shines in several scenarios: maintaining cache tables, implementing idempotent API operations, syncing data from external sources, and managing session state. Any time you need “insert or update” logic, ON CONFLICT should be your default choice. It’s faster than separate queries, eliminates race conditions, and makes your intent explicit.

Basic UPSERT Syntax and Examples

The simplest form of ON CONFLICT just ignores conflicts:

INSERT INTO users (id, email, name)
VALUES (1, 'john@example.com', 'John Doe')
ON CONFLICT (id) DO NOTHING;

This attempts the insert but silently succeeds if a row with id = 1 already exists. It’s useful for “insert if not exists” logic where you don’t care about updating existing rows.

More commonly, you’ll want to update on conflict:

INSERT INTO users (id, email, name, last_login)
VALUES (1, 'john@example.com', 'John Doe', NOW())
ON CONFLICT (id) 
DO UPDATE SET 
    email = EXCLUDED.email,
    last_login = EXCLUDED.last_login;

The EXCLUDED keyword references the row that would have been inserted. This query inserts a new user or updates the email and last_login timestamp if the user already exists. Notice that we don’t update name—you have full control over which columns get updated.

Here’s a more complete example with a users table:

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email TEXT UNIQUE NOT NULL,
    name TEXT NOT NULL,
    login_count INTEGER DEFAULT 0,
    last_login TIMESTAMP
);

INSERT INTO users (email, name, login_count, last_login)
VALUES ('alice@example.com', 'Alice Smith', 1, NOW())
ON CONFLICT (email)
DO UPDATE SET
    login_count = users.login_count + 1,
    last_login = EXCLUDED.last_login;

This increments the login counter each time we process a login event, demonstrating how you can reference both the existing row (users.login_count) and the proposed values (EXCLUDED.last_login).

Working with Conflict Targets

The conflict target—what goes in the parentheses after ON CONFLICT—must be a unique index or constraint. You have three options for specifying it:

Option 1: Column list

ON CONFLICT (email)
ON CONFLICT (store_id, product_id)  -- composite

Option 2: Constraint name

ON CONFLICT ON CONSTRAINT users_email_key

Option 3: Partial index expression

ON CONFLICT (email) WHERE active = true

For composite unique constraints, you must specify all columns:

CREATE TABLE inventory (
    store_id INTEGER NOT NULL,
    product_id INTEGER NOT NULL,
    quantity INTEGER NOT NULL DEFAULT 0,
    last_updated TIMESTAMP DEFAULT NOW(),
    UNIQUE(store_id, product_id)
);

INSERT INTO inventory (store_id, product_id, quantity)
VALUES (5, 1001, 50)
ON CONFLICT (store_id, product_id)
DO UPDATE SET
    quantity = inventory.quantity + EXCLUDED.quantity,
    last_updated = NOW();

This pattern is perfect for inventory systems where you want to add to existing stock levels rather than replace them.

If your table has multiple unique constraints, you can only target one per statement. To handle conflicts on different constraints, you need separate queries or you must choose the most appropriate constraint for your use case.

Advanced UPSERT Techniques

The real power of ON CONFLICT emerges when you add WHERE clauses to conditionally perform updates:

CREATE TABLE analytics (
    metric_name TEXT PRIMARY KEY,
    value BIGINT NOT NULL,
    updated_at TIMESTAMP DEFAULT NOW()
);

INSERT INTO analytics (metric_name, value, updated_at)
VALUES ('max_concurrent_users', 150, NOW())
ON CONFLICT (metric_name)
DO UPDATE SET
    value = EXCLUDED.value,
    updated_at = EXCLUDED.updated_at
WHERE EXCLUDED.value > analytics.value;

This only updates the metric if the new value is higher than the existing one—perfect for tracking high-water marks. If the condition fails, PostgreSQL performs no update at all, which is more efficient than unconditionally updating with the same value.

You can build complex conditional logic by combining EXCLUDED references with existing column values:

CREATE TABLE user_preferences (
    user_id INTEGER PRIMARY KEY,
    theme TEXT,
    notifications_enabled BOOLEAN,
    version INTEGER NOT NULL DEFAULT 1
);

INSERT INTO user_preferences (user_id, theme, notifications_enabled, version)
VALUES (42, 'dark', true, 5)
ON CONFLICT (user_id)
DO UPDATE SET
    theme = EXCLUDED.theme,
    notifications_enabled = EXCLUDED.notifications_enabled,
    version = EXCLUDED.version
WHERE EXCLUDED.version > user_preferences.version;

This implements optimistic locking: only apply the update if the incoming version is newer. This prevents older data from overwriting newer data in distributed systems.

RETURNING Clause with UPSERT

The RETURNING clause lets you retrieve data from the affected row, which is invaluable for confirming what happened:

INSERT INTO user_sessions (user_id, session_token, created_at, last_activity)
VALUES (123, 'abc123xyz', NOW(), NOW())
ON CONFLICT (user_id)
DO UPDATE SET
    last_activity = NOW()
RETURNING 
    user_id,
    session_token,
    created_at,
    (xmax = 0) AS was_inserted;

The xmax = 0 trick distinguishes inserts from updates. When xmax is 0, the row was inserted; otherwise, it was updated. This is a PostgreSQL-specific internal column that tracks transaction visibility.

Here’s a practical session tracking example:

WITH upserted AS (
    INSERT INTO sessions (user_id, ip_address, user_agent, last_seen)
    VALUES ($1, $2, $3, NOW())
    ON CONFLICT (user_id)
    DO UPDATE SET
        last_seen = NOW(),
        ip_address = EXCLUDED.ip_address
    RETURNING *, (xmax = 0) AS inserted
)
SELECT 
    user_id,
    CASE WHEN inserted THEN 'created' ELSE 'updated' END as action,
    last_seen
FROM upserted;

Your application code can use this to log different messages for new sessions versus refreshed sessions.

Common Pitfalls and Best Practices

You must have a unique index. This will fail:

-- No unique constraint on email!
CREATE TABLE contacts (email TEXT, name TEXT);

INSERT INTO contacts (email, name)
VALUES ('test@example.com', 'Test')
ON CONFLICT (email) DO NOTHING;  -- ERROR!

PostgreSQL will throw: ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification. Create your indexes before using ON CONFLICT.

Index your conflict targets properly. Even with a unique constraint, poor indexing kills performance:

-- Good: B-tree index (default for UNIQUE)
CREATE UNIQUE INDEX idx_users_email ON users(email);

-- Also good: for case-insensitive conflicts
CREATE UNIQUE INDEX idx_users_email_lower ON users(LOWER(email));

Understand transaction isolation. Under READ COMMITTED (PostgreSQL’s default), ON CONFLICT can still fail if another transaction commits between your conflict check and your update. For critical operations, consider SERIALIZABLE isolation, though it has performance implications.

Batch your UPSERTs when possible:

INSERT INTO metrics (name, value, recorded_at)
VALUES 
    ('cpu_usage', 45.2, NOW()),
    ('memory_usage', 78.1, NOW()),
    ('disk_usage', 62.3, NOW())
ON CONFLICT (name)
DO UPDATE SET
    value = EXCLUDED.value,
    recorded_at = EXCLUDED.recorded_at;

Single multi-row inserts are much faster than multiple single-row inserts.

Real-World Use Case: Idempotent API Operations

UPSERT is essential for building idempotent APIs. Consider a payment processing webhook that might be called multiple times with the same transaction:

CREATE TABLE payment_events (
    external_transaction_id TEXT PRIMARY KEY,
    amount DECIMAL(10,2) NOT NULL,
    status TEXT NOT NULL,
    processed_at TIMESTAMP DEFAULT NOW(),
    retry_count INTEGER DEFAULT 0
);

-- In your API handler
INSERT INTO payment_events (
    external_transaction_id,
    amount,
    status,
    processed_at,
    retry_count
)
VALUES ($1, $2, $3, NOW(), 0)
ON CONFLICT (external_transaction_id)
DO UPDATE SET
    retry_count = payment_events.retry_count + 1,
    processed_at = NOW()
WHERE payment_events.status != 'completed'
RETURNING 
    external_transaction_id,
    status,
    retry_count,
    (xmax = 0) AS is_new_event;

This pattern ensures that:

  • First webhook delivery creates the record
  • Duplicate deliveries increment retry_count but don’t change completed transactions
  • Your application receives clear feedback about whether this was a new or duplicate event

Here’s how you might use this in a Node.js API:

app.post('/webhooks/payment', async (req, res) => {
    const { transactionId, amount, status } = req.body;
    
    const result = await db.query(`
        INSERT INTO payment_events (external_transaction_id, amount, status)
        VALUES ($1, $2, $3)
        ON CONFLICT (external_transaction_id)
        DO UPDATE SET retry_count = payment_events.retry_count + 1
        WHERE payment_events.status != 'completed'
        RETURNING *, (xmax = 0) AS is_new_event
    `, [transactionId, amount, status]);
    
    if (result.rows[0].is_new_event) {
        // Process new payment
        await processPayment(result.rows[0]);
        res.status(201).json({ message: 'Payment processed' });
    } else {
        // Duplicate webhook
        res.status(200).json({ message: 'Duplicate event ignored' });
    }
});

This approach eliminates the need for distributed locks or complex deduplication logic. The database handles concurrency for you, making your API naturally idempotent and safe under retry scenarios.

PostgreSQL’s ON CONFLICT transforms what used to require complex application logic into simple, declarative SQL. Master it, and you’ll write more robust, performant applications.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.