SQL - CHARINDEX() / POSITION() / INSTR()
Each database platform implements substring searching differently. Here's the fundamental syntax for each:
Key Insights
- CHARINDEX() (SQL Server/Azure SQL), POSITION() (PostgreSQL/standard SQL), and INSTR() (Oracle/MySQL) serve the same purpose: finding substring positions within strings, but with critical syntax differences that affect portability
- Return values differ across databases: CHARINDEX() and POSITION() return 0 when no match is found, while INSTR() returns 0 in MySQL but can return 0 or NULL in Oracle depending on usage
- Performance characteristics vary significantly—understanding when to use these functions versus full-text search or pattern matching can mean the difference between millisecond and multi-second query times
Function Syntax and Basic Usage
Each database platform implements substring searching differently. Here’s the fundamental syntax for each:
SQL Server - CHARINDEX()
-- Basic syntax: CHARINDEX(substring, string, [start_position])
SELECT CHARINDEX('world', 'hello world') AS position;
-- Returns: 7
SELECT CHARINDEX('x', 'hello world') AS position;
-- Returns: 0 (not found)
-- With start position
SELECT CHARINDEX('o', 'hello world', 5) AS position;
-- Returns: 8 (finds second 'o')
PostgreSQL - POSITION()
-- Basic syntax: POSITION(substring IN string)
SELECT POSITION('world' IN 'hello world') AS position;
-- Returns: 7
SELECT POSITION('x' IN 'hello world') AS position;
-- Returns: 0 (not found)
-- PostgreSQL also supports STRPOS() for C-style syntax
SELECT STRPOS('hello world', 'world') AS position;
-- Returns: 7
Oracle/MySQL - INSTR()
-- Basic syntax: INSTR(string, substring, [start_position], [occurrence])
SELECT INSTR('hello world', 'world') AS position FROM DUAL;
-- Returns: 7
SELECT INSTR('hello world', 'x') AS position FROM DUAL;
-- Returns: 0 (not found)
-- Find second occurrence
SELECT INSTR('hello world hello', 'hello', 1, 2) AS position FROM DUAL;
-- Returns: 13
Case Sensitivity and Collation Handling
Case sensitivity behavior depends on database collation settings and can cause subtle bugs in production.
SQL Server
-- Respects database collation
SELECT CHARINDEX('World', 'hello world') AS position;
-- Returns: 7 (case-insensitive by default with SQL_Latin1_General_CP1_CI_AS)
-- Force case-sensitive search
SELECT CHARINDEX('World', 'hello world' COLLATE SQL_Latin1_General_CP1_CS_AS) AS position;
-- returns: 0
-- Case-insensitive guarantee
SELECT CHARINDEX(LOWER('World'), LOWER('hello world')) AS position;
-- Returns: 7
PostgreSQL
-- Case-sensitive by default
SELECT POSITION('World' IN 'hello world') AS position;
-- Returns: 0
-- Case-insensitive with ILIKE or LOWER()
SELECT POSITION(LOWER('World') IN LOWER('hello world')) AS position;
-- Returns: 7
-- Using STRPOS with case-insensitive comparison
SELECT STRPOS(LOWER('hello world'), LOWER('World')) AS position;
-- Returns: 7
MySQL/Oracle
-- MySQL: depends on collation (typically case-insensitive)
SELECT INSTR('hello world', 'World') AS position;
-- Returns: 7 (utf8mb4_general_ci collation)
-- Force case-sensitive in MySQL
SELECT INSTR('hello world' COLLATE utf8mb4_bin, 'World') AS position;
-- Returns: 0
-- Oracle: case-sensitive by default
SELECT INSTR('hello world', 'World') FROM DUAL;
-- Returns: 0
Practical Applications in WHERE Clauses
These functions excel at filtering records based on substring presence or position.
-- SQL Server: Find emails from specific domain
SELECT
user_id,
email,
CHARINDEX('@gmail.com', email) AS gmail_position
FROM users
WHERE CHARINDEX('@gmail.com', email) > 0;
-- PostgreSQL: Extract records with specific path patterns
SELECT
file_id,
file_path
FROM documents
WHERE POSITION('/uploads/' IN file_path) > 0
AND POSITION('.pdf' IN file_path) > 0;
-- MySQL: Find products with SKU patterns
SELECT
product_id,
sku,
INSTR(sku, 'PRO-') AS prefix_pos
FROM products
WHERE INSTR(sku, 'PRO-') = 1 -- Must start with PRO-
AND INSTR(sku, '-2024') > 0; -- Must contain -2024
Combining with SUBSTRING for Data Extraction
The real power emerges when combining position functions with substring extraction.
SQL Server
-- Extract domain from email
SELECT
email,
SUBSTRING(
email,
CHARINDEX('@', email) + 1,
LEN(email)
) AS domain
FROM users
WHERE CHARINDEX('@', email) > 0;
-- Extract filename from full path
SELECT
file_path,
SUBSTRING(
file_path,
LEN(file_path) - CHARINDEX('\', REVERSE(file_path)) + 2,
LEN(file_path)
) AS filename
FROM files;
PostgreSQL
-- Parse URL components
SELECT
url,
SUBSTRING(url FROM 1 FOR POSITION('://' IN url) - 1) AS protocol,
SUBSTRING(url FROM POSITION('://' IN url) + 3) AS rest
FROM web_requests
WHERE POSITION('://' IN url) > 0;
-- Extract username from email
SELECT
email,
SUBSTRING(email, 1, POSITION('@' IN email) - 1) AS username
FROM users
WHERE POSITION('@' IN email) > 0;
Oracle/MySQL
-- Parse CSV-like data (first two values)
SELECT
data_string,
SUBSTR(data_string, 1, INSTR(data_string, ',') - 1) AS first_value,
SUBSTR(
data_string,
INSTR(data_string, ',') + 1,
INSTR(data_string, ',', 1, 2) - INSTR(data_string, ',') - 1
) AS second_value
FROM raw_data
WHERE INSTR(data_string, ',', 1, 2) > 0;
Performance Considerations and Indexing
String position functions are not sargable and prevent index usage in WHERE clauses.
-- SQL Server: Non-sargable (index not used)
SELECT * FROM users
WHERE CHARINDEX('@gmail.com', email) > 0;
-- Better: Use computed column with index
ALTER TABLE users
ADD is_gmail AS (CASE WHEN email LIKE '%@gmail.com' THEN 1 ELSE 0 END) PERSISTED;
CREATE INDEX idx_users_gmail ON users(is_gmail)
WHERE is_gmail = 1;
SELECT * FROM users WHERE is_gmail = 1;
-- PostgreSQL: Create functional index
CREATE INDEX idx_users_email_domain
ON users ((SUBSTRING(email FROM POSITION('@' IN email) + 1)));
-- Query using the indexed expression
SELECT * FROM users
WHERE SUBSTRING(email FROM POSITION('@' IN email) + 1) = 'gmail.com';
Cross-Database Compatibility Strategies
When building portable applications, abstract these differences.
-- Create compatibility function (PostgreSQL example)
CREATE OR REPLACE FUNCTION CHARINDEX(needle TEXT, haystack TEXT, start_pos INT DEFAULT 1)
RETURNS INT AS $$
BEGIN
RETURN STRPOS(SUBSTR(haystack, start_pos), needle) + start_pos - 1;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
-- MySQL compatibility wrapper for POSITION
DELIMITER //
CREATE FUNCTION POSITION_COMPAT(needle VARCHAR(255), haystack TEXT)
RETURNS INT
DETERMINISTIC
BEGIN
RETURN INSTR(haystack, needle);
END //
DELIMITER ;
Handling NULL and Edge Cases
Proper NULL handling prevents unexpected query results.
-- SQL Server: NULL handling
SELECT
CHARINDEX('test', NULL) AS result1, -- Returns NULL
CHARINDEX(NULL, 'test string') AS result2, -- Returns NULL
ISNULL(CHARINDEX('x', column_name), 0) AS safe_result
FROM table_name;
-- PostgreSQL: COALESCE for NULL safety
SELECT
COALESCE(POSITION('test' IN NULL), 0) AS result1,
COALESCE(POSITION(NULL IN 'test'), 0) AS result2
FROM table_name;
-- Empty string handling
SELECT
CHARINDEX('', 'test') AS empty_needle, -- SQL Server: 1
POSITION('' IN 'test') AS empty_needle_pg, -- PostgreSQL: 1
INSTR('test', '') AS empty_needle_ora; -- Oracle: 1
These functions are fundamental tools for string manipulation in SQL. Choose based on your database platform, understand collation implications, and always consider performance impact when using them in WHERE clauses on large datasets. For complex pattern matching or full-text requirements, evaluate specialized features like SQL Server’s LIKE operator with wildcards, PostgreSQL’s regular expressions, or dedicated full-text search capabilities.