SQL - REPLACE() Function
The REPLACE() function follows a straightforward syntax across most SQL databases:
Key Insights
- REPLACE() substitutes all occurrences of a specified substring within a string, making it essential for data cleaning, formatting corrections, and bulk text transformations in SQL queries
- Unlike UPDATE statements that modify table data, REPLACE() performs non-destructive string manipulation in SELECT queries, though it can be combined with UPDATE for permanent changes
- Performance considerations matter: REPLACE() operations on large text columns without proper indexing can significantly impact query execution time, especially when nested or used in WHERE clauses
Basic Syntax and Usage
The REPLACE() function follows a straightforward syntax across most SQL databases:
REPLACE(original_string, substring_to_replace, replacement_substring)
Here’s a practical example cleaning phone number formats:
SELECT
customer_name,
phone_number AS original,
REPLACE(phone_number, '-', '') AS cleaned
FROM customers;
-- Output:
-- customer_name | original | cleaned
-- John Smith | 555-123-4567 | 5551234567
-- Jane Doe | 555-987-6543 | 5559876543
All occurrences of the search string are replaced. If the substring doesn’t exist, the original string returns unchanged:
SELECT REPLACE('Hello World', 'xyz', 'ABC');
-- Returns: 'Hello World'
Data Cleaning Operations
REPLACE() excels at cleaning inconsistent data. Remove unwanted characters, standardize formats, or correct common data entry mistakes:
-- Remove multiple types of whitespace characters
SELECT
product_code,
REPLACE(REPLACE(REPLACE(product_code, ' ', ''), CHAR(9), ''), CHAR(10), '') AS cleaned_code
FROM products;
-- Standardize currency symbols
SELECT
product_name,
price,
REPLACE(REPLACE(price_string, '$', ''), ',', '') AS numeric_price
FROM product_catalog;
For email domain migrations or corrections:
SELECT
email,
REPLACE(email, '@oldcompany.com', '@newcompany.com') AS updated_email
FROM employees
WHERE email LIKE '%@oldcompany.com';
Chaining Multiple REPLACE() Functions
Complex transformations require nested REPLACE() calls. While not elegant, this approach handles multiple substitutions:
-- Clean and standardize address data
SELECT
address,
REPLACE(
REPLACE(
REPLACE(
REPLACE(address, 'Street', 'St'),
'Avenue', 'Ave'
),
'Boulevard', 'Blvd'
),
'Road', 'Rd'
) AS standardized_address
FROM locations;
A more readable approach using CTEs:
WITH step1 AS (
SELECT id, REPLACE(description, 'colour', 'color') AS description
FROM products
),
step2 AS (
SELECT id, REPLACE(description, 'centre', 'center') AS description
FROM step1
),
step3 AS (
SELECT id, REPLACE(description, 'organised', 'organized') AS description
FROM step2
)
SELECT * FROM step3;
Updating Table Data
Combine REPLACE() with UPDATE statements for permanent modifications:
-- Fix common typos in product descriptions
UPDATE products
SET description = REPLACE(description, 'teh', 'the')
WHERE description LIKE '%teh%';
-- Update multiple columns
UPDATE customer_contacts
SET
phone = REPLACE(phone, '(', ''),
phone = REPLACE(phone, ')', ''),
phone = REPLACE(phone, ' ', '')
WHERE phone LIKE '%(%';
Always test with SELECT before executing UPDATE:
-- Test first
SELECT
id,
description,
REPLACE(description, 'old_term', 'new_term') AS preview
FROM products
WHERE description LIKE '%old_term%';
-- Then update
UPDATE products
SET description = REPLACE(description, 'old_term', 'new_term')
WHERE description LIKE '%old_term%';
Case Sensitivity Considerations
REPLACE() behavior varies by database system and collation settings:
-- MySQL (case-insensitive by default with utf8_general_ci)
SELECT REPLACE('Hello World', 'hello', 'Hi');
-- Returns: 'Hi World'
-- PostgreSQL (case-sensitive)
SELECT REPLACE('Hello World', 'hello', 'Hi');
-- Returns: 'Hello World' (no match)
-- Case-insensitive replacement in PostgreSQL
SELECT REPLACE(LOWER('Hello World'), 'hello', 'hi');
-- Returns: 'hi world'
For case-insensitive replacements while preserving original case in PostgreSQL:
SELECT
OVERLAY(original_text
PLACING 'replacement'
FROM POSITION(LOWER('search') IN LOWER(original_text))
FOR LENGTH('search')
) AS result
FROM table_name;
Performance Optimization
REPLACE() operations on large datasets require optimization strategies:
-- Inefficient: REPLACE in WHERE clause forces full table scan
SELECT * FROM logs
WHERE REPLACE(message, ' ', '') LIKE '%error%';
-- Better: Use indexed column for filtering first
SELECT * FROM logs
WHERE message LIKE '% error %'
OR message LIKE '%error%'
OR message LIKE '%error %';
-- Best: Use full-text search if available
SELECT * FROM logs
WHERE MATCH(message) AGAINST('error' IN BOOLEAN MODE);
Create computed columns for frequently used REPLACE() operations:
-- Add computed column (SQL Server)
ALTER TABLE products
ADD cleaned_sku AS REPLACE(REPLACE(sku, '-', ''), ' ', '') PERSISTED;
CREATE INDEX idx_cleaned_sku ON products(cleaned_sku);
-- Query using indexed computed column
SELECT * FROM products
WHERE cleaned_sku = 'ABC123';
Handling NULL Values
REPLACE() returns NULL when the input string is NULL:
SELECT REPLACE(NULL, 'a', 'b');
-- Returns: NULL
-- Handle NULLs with COALESCE
SELECT
product_name,
COALESCE(REPLACE(description, 'old', 'new'), 'No description') AS description
FROM products;
Advanced Pattern Replacements
For complex patterns, combine REPLACE() with other string functions:
-- Remove all digits
SELECT
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(text, '0', ''),
'1', ''),
'2', ''),
'3', ''),
'4', ''),
'5', ''),
'6', ''),
'7', ''),
'8', ''),
'9', '') AS no_digits
FROM data_table;
Better approach using REGEXP_REPLACE (PostgreSQL, MySQL 8.0+):
SELECT REGEXP_REPLACE(text, '[0-9]', '', 'g') AS no_digits
FROM data_table;
Database-Specific Variations
SQL Server supports additional parameters:
-- SQL Server doesn't support additional parameters in REPLACE
-- Use STUFF for position-based replacement
SELECT STUFF('Hello World', 7, 5, 'SQL Server');
-- Returns: 'Hello SQL Server'
Oracle uses REPLACE identically but offers TRANSLATE for character-by-character replacement:
-- Oracle TRANSLATE for multiple single-character replacements
SELECT TRANSLATE('ABC123', '123', 'XYZ') FROM DUAL;
-- Returns: 'ABCXYZ'
PostgreSQL offers TRANSLATE and REGEXP_REPLACE for advanced scenarios:
-- Multiple character replacements
SELECT TRANSLATE('Hello World', 'elo', '310');
-- Returns: 'H311o Wor1d'
Common Pitfalls
Empty string replacements can cause unexpected results:
-- Replacing with empty string removes the substring
SELECT REPLACE('Hello World', 'World', '');
-- Returns: 'Hello '
-- Be explicit about spacing
SELECT TRIM(REPLACE('Hello World', 'World', ''));
-- Returns: 'Hello'
Order matters in chained replacements:
-- Wrong order
SELECT REPLACE(REPLACE('AAA', 'A', 'B'), 'B', 'C');
-- Returns: 'CCC' (not 'BBB')
-- Replacements are sequential
REPLACE() operates on the result of previous replacements, which can lead to unintended cascading changes in nested operations.