SQL - IN Operator with Examples

The IN operator tests whether a value matches any value in a specified list or subquery result. It returns TRUE if the value exists in the set, FALSE otherwise, and NULL if comparing against NULL...

Key Insights

  • The IN operator provides a cleaner, more maintainable alternative to multiple OR conditions when filtering against a list of discrete values
  • IN works with subqueries to filter based on dynamic result sets, enabling powerful cross-table filtering without explicit joins
  • Understanding IN’s performance characteristics—including index usage and comparison with EXISTS—is critical for writing efficient queries at scale

Understanding the IN Operator

The IN operator tests whether a value matches any value in a specified list or subquery result. It returns TRUE if the value exists in the set, FALSE otherwise, and NULL if comparing against NULL values.

Basic syntax:

SELECT column1, column2
FROM table_name
WHERE column_name IN (value1, value2, value3);

This is functionally equivalent to:

SELECT column1, column2
FROM table_name
WHERE column_name = value1 
   OR column_name = value2 
   OR column_name = value3;

The IN operator significantly improves readability when dealing with multiple values.

Filtering with Literal Values

The most straightforward use case involves filtering against a static list of values.

-- Find orders from specific regions
SELECT order_id, customer_id, region, total_amount
FROM orders
WHERE region IN ('North', 'South', 'West');

-- Filter products by category IDs
SELECT product_id, product_name, category_id, price
FROM products
WHERE category_id IN (1, 3, 5, 7, 9);

-- Find employees hired in specific years
SELECT employee_id, first_name, last_name, hire_date
FROM employees
WHERE YEAR(hire_date) IN (2020, 2021, 2022);

String values require quotes, while numeric values don’t. The IN operator handles type checking automatically.

Using IN with Subqueries

The real power of IN emerges when combined with subqueries, allowing dynamic filtering based on related data.

-- Find customers who placed orders in the last 30 days
SELECT customer_id, customer_name, email
FROM customers
WHERE customer_id IN (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
);

-- Get products that have never been ordered
SELECT product_id, product_name, stock_quantity
FROM products
WHERE product_id NOT IN (
    SELECT DISTINCT product_id
    FROM order_items
    WHERE product_id IS NOT NULL
);

-- Find employees who manage other employees
SELECT employee_id, first_name, last_name, title
FROM employees
WHERE employee_id IN (
    SELECT DISTINCT manager_id
    FROM employees
    WHERE manager_id IS NOT NULL
);

Note the NULL check in the NOT IN example—this is critical and we’ll explore why shortly.

The NOT IN Operator

NOT IN returns TRUE when the value doesn’t exist in the specified set. It’s the logical inverse of IN.

-- Find products not in specific categories
SELECT product_id, product_name, category_id
FROM products
WHERE category_id NOT IN (2, 4, 6);

-- Get customers who haven't ordered in 2023
SELECT customer_id, customer_name, last_contact_date
FROM customers
WHERE customer_id NOT IN (
    SELECT customer_id
    FROM orders
    WHERE YEAR(order_date) = 2023
);

-- Exclude test accounts
SELECT user_id, username, email, created_at
FROM users
WHERE email NOT IN (
    'test@example.com',
    'admin@example.com',
    'demo@example.com'
);

The NULL Problem with NOT IN

NOT IN behaves unexpectedly when the list contains NULL values. This is one of the most common SQL gotchas.

-- This returns NO rows if any product_id in order_items is NULL
SELECT product_id, product_name
FROM products
WHERE product_id NOT IN (
    SELECT product_id FROM order_items
);

-- Correct approach: filter out NULLs
SELECT product_id, product_name
FROM products
WHERE product_id NOT IN (
    SELECT product_id 
    FROM order_items 
    WHERE product_id IS NOT NULL
);

Why does this happen? When SQL compares a value against NULL using NOT IN, the result is UNKNOWN (not FALSE). Since the WHERE clause requires TRUE, rows are excluded.

The safe alternatives:

-- Using NOT EXISTS (handles NULLs correctly)
SELECT p.product_id, p.product_name
FROM products p
WHERE NOT EXISTS (
    SELECT 1 
    FROM order_items oi 
    WHERE oi.product_id = p.product_id
);

-- Using LEFT JOIN
SELECT p.product_id, p.product_name
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
WHERE oi.product_id IS NULL;

IN vs EXISTS: Performance Considerations

For subqueries, EXISTS often outperforms IN, especially with large datasets.

-- Using IN (may be slower)
SELECT c.customer_id, c.customer_name
FROM customers c
WHERE c.customer_id IN (
    SELECT o.customer_id
    FROM orders o
    WHERE o.total_amount > 1000
);

-- Using EXISTS (typically faster)
SELECT c.customer_id, c.customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
      AND o.total_amount > 1000
);

EXISTS stops searching once it finds a match, while IN may evaluate the entire subquery. Modern query optimizers often rewrite IN to EXISTS, but EXISTS gives you explicit control.

Use IN when:

  • Working with small, static lists of values
  • The subquery returns a small result set
  • Readability is paramount

Use EXISTS when:

  • Dealing with large subqueries
  • The subquery might return many rows per outer row
  • You need guaranteed NULL handling

Combining IN with Other Conditions

IN integrates seamlessly with other WHERE clause conditions using AND/OR logic.

-- Multiple IN conditions
SELECT product_id, product_name, category_id, price
FROM products
WHERE category_id IN (1, 2, 3)
  AND price IN (9.99, 19.99, 29.99, 49.99);

-- IN with range conditions
SELECT order_id, order_date, status, total_amount
FROM orders
WHERE status IN ('pending', 'processing')
  AND order_date >= '2023-01-01'
  AND total_amount > 100;

-- Complex filtering
SELECT e.employee_id, e.first_name, e.last_name, e.department_id
FROM employees e
WHERE e.department_id IN (
    SELECT department_id 
    FROM departments 
    WHERE location IN ('New York', 'Boston', 'Chicago')
)
AND e.hire_date >= '2020-01-01'
AND e.salary > 50000;

Case Sensitivity and Collation

String comparisons with IN respect the database’s collation settings.

-- Case-insensitive by default in MySQL (depending on collation)
SELECT product_name
FROM products
WHERE product_name IN ('Widget', 'WIDGET', 'widget');

-- Force case-sensitive comparison (MySQL)
SELECT product_name
FROM products
WHERE product_name COLLATE utf8mb4_bin IN ('Widget', 'WIDGET');

-- PostgreSQL is case-sensitive by default
SELECT product_name
FROM products
WHERE LOWER(product_name) IN ('widget', 'gadget', 'tool');

Practical Patterns

Dynamic filtering in application code:

-- Building IN clauses programmatically (use parameterized queries)
-- Python example with psycopg2
category_ids = [1, 3, 5, 7]
placeholders = ','.join(['%s'] * len(category_ids))
query = f"SELECT * FROM products WHERE category_id IN ({placeholders})"
cursor.execute(query, category_ids)

Pagination with IN:

-- Get specific page of results by IDs
SELECT * FROM products
WHERE product_id IN (101, 102, 103, 104, 105)
ORDER BY product_name;

Audit queries:

-- Find records modified by specific users
SELECT table_name, record_id, modified_date, modified_by
FROM audit_log
WHERE modified_by IN (
    SELECT user_id 
    FROM users 
    WHERE role = 'admin'
)
AND modified_date >= CURRENT_DATE - INTERVAL '7 days';

The IN operator is a fundamental SQL tool that simplifies queries and improves maintainability. Master its nuances—particularly NULL handling and performance characteristics—to write robust, efficient database queries.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.