How to Use String Functions in SQLite
SQLite includes a comprehensive set of string manipulation functions that let you transform, search, and analyze text data directly in your queries. While SQLite is known for being lightweight and...
Key Insights
- SQLite’s string functions execute directly in the database layer, eliminating the need to pull data into application code for basic text manipulation and improving query performance
- The concatenation operator
||and functions likeSUBSTR()andREPLACE()enable powerful data transformations in SELECT statements, making SQLite suitable for ETL-like operations without external tools - Understanding the case-sensitivity differences between
LIKE(case-insensitive for ASCII) andGLOB(case-sensitive) prevents common bugs in pattern matching queries
Introduction to SQLite String Functions
SQLite includes a comprehensive set of string manipulation functions that let you transform, search, and analyze text data directly in your queries. While SQLite is known for being lightweight and embedded, it doesn’t skimp on string handling capabilities. These functions are essential for data cleaning, formatting output, implementing search features, and performing text-based filtering without round-tripping data through your application layer.
The key advantage is performance and simplicity. Instead of fetching thousands of rows and processing them in Python, JavaScript, or another language, you can transform data at the database level where it’s most efficient. This article covers the most useful string functions with practical examples you can implement immediately.
Basic String Manipulation Functions
The foundational string functions handle common transformations you’ll use constantly.
UPPER() and LOWER() convert strings to uppercase or lowercase. This is critical for case-insensitive comparisons:
-- Find products regardless of how the name was entered
SELECT * FROM products
WHERE LOWER(product_name) = LOWER('widget pro');
-- Standardize output for reports
SELECT
UPPER(category) as category,
COUNT(*) as product_count
FROM products
GROUP BY UPPER(category);
LENGTH() returns the number of characters in a string. Use it for validation or filtering:
-- Find potentially invalid phone numbers
SELECT customer_id, phone
FROM customers
WHERE LENGTH(phone) < 10;
-- Get articles by length category
SELECT
title,
CASE
WHEN LENGTH(content) < 500 THEN 'short'
WHEN LENGTH(content) < 2000 THEN 'medium'
ELSE 'long'
END as article_length
FROM blog_posts;
TRIM() removes whitespace from both ends of a string. The variants LTRIM() and RTRIM() trim only the left or right side:
-- Clean up user input data
UPDATE users
SET email = TRIM(email),
first_name = TRIM(first_name),
last_name = TRIM(last_name)
WHERE email LIKE ' %' OR email LIKE '% ';
-- Remove specific characters
SELECT TRIM(price, '$') as numeric_price
FROM products;
The second parameter to TRIM() lets you specify which characters to remove, not just whitespace.
Substring and Search Functions
When you need to extract portions of strings or locate text within them, these functions are indispensable.
SUBSTR() extracts a substring starting at a position for a specified length. SQLite uses 1-based indexing:
-- Extract area code from phone numbers (assuming format: 555-123-4567)
SELECT
customer_id,
SUBSTR(phone, 1, 3) as area_code,
SUBSTR(phone, 5, 3) as exchange,
SUBSTR(phone, 9, 4) as line_number
FROM customers;
-- Get the first 50 characters for preview text
SELECT
title,
SUBSTR(content, 1, 50) || '...' as preview
FROM articles;
INSTR() returns the position of the first occurrence of a substring, or 0 if not found:
-- Find the position of '@' in email addresses
SELECT
email,
INSTR(email, '@') as at_position
FROM users;
-- Extract username from email (everything before @)
SELECT
email,
SUBSTR(email, 1, INSTR(email, '@') - 1) as username
FROM users
WHERE INSTR(email, '@') > 0;
-- Validate email format (must contain @)
SELECT * FROM users
WHERE INSTR(email, '@') = 0;
REPLACE() substitutes all occurrences of a substring with another string:
-- Normalize phone number format
UPDATE customers
SET phone = REPLACE(REPLACE(REPLACE(phone, '-', ''), '(', ''), ')', '');
-- Change domain in email addresses
UPDATE users
SET email = REPLACE(email, '@oldcompany.com', '@newcompany.com')
WHERE email LIKE '%@oldcompany.com';
-- Remove unwanted characters from product codes
SELECT
product_code,
REPLACE(REPLACE(product_code, '-', ''), ' ', '') as clean_code
FROM inventory;
Pattern Matching with LIKE and GLOB
SQLite provides two pattern matching operators with different behaviors.
LIKE is case-insensitive for ASCII characters and uses % (matches any sequence) and _ (matches single character):
-- Find customers whose names start with 'A'
SELECT * FROM customers
WHERE last_name LIKE 'A%';
-- Find emails from specific domain
SELECT * FROM users
WHERE email LIKE '%@gmail.com';
-- Find phone numbers with specific pattern (555-1___)
SELECT * FROM customers
WHERE phone LIKE '555-1___';
-- Search for products containing 'pro' anywhere in the name
SELECT * FROM products
WHERE product_name LIKE '%pro%';
GLOB is case-sensitive and uses Unix-style wildcards: * (any sequence), ? (single character), and [] (character ranges):
-- Case-sensitive search for products starting with capital P
SELECT * FROM products
WHERE product_name GLOB 'P*';
-- Find codes with specific pattern (must start with capital letter)
SELECT * FROM items
WHERE item_code GLOB '[A-Z][0-9][0-9][0-9]';
-- Match files with specific extensions
SELECT * FROM files
WHERE filename GLOB '*.{jpg,png,gif}';
The critical difference: LIKE 'abc' matches ‘ABC’, but GLOB 'abc' does not. Choose based on your case-sensitivity requirements.
Advanced String Concatenation
Building strings from multiple columns is a common requirement for reports and formatted output.
The || operator concatenates strings:
-- Build full names
SELECT
first_name || ' ' || last_name as full_name,
email
FROM users;
-- Create formatted addresses
SELECT
street || ', ' || city || ', ' || state || ' ' || zip as full_address
FROM addresses;
-- Handle NULL values with COALESCE
SELECT
first_name || ' ' || COALESCE(middle_name || ' ', '') || last_name as full_name
FROM users;
PRINTF() provides C-style formatted string output:
-- Format prices with currency
SELECT
product_name,
PRINTF('$%.2f', price) as formatted_price
FROM products;
-- Create zero-padded IDs
SELECT
PRINTF('ORD-%06d', order_id) as formatted_order_id,
order_date
FROM orders;
-- Build formatted labels
SELECT
PRINTF('%s (%d items)', category, item_count) as category_label
FROM category_summary;
The PRINTF() function supports standard format specifiers: %s (string), %d (integer), %f (float), and modifiers for padding and precision.
Practical Use Cases and Performance Tips
Combine multiple string functions for complex transformations:
-- Clean and standardize product names
UPDATE products
SET product_name = UPPER(TRIM(REPLACE(REPLACE(product_name, ' ', ' '), '-', ' ')))
WHERE product_name != UPPER(TRIM(REPLACE(REPLACE(product_name, ' ', ' '), '-', ' ')));
-- Create searchable full-text column
ALTER TABLE contacts ADD COLUMN search_text TEXT;
UPDATE contacts
SET search_text = LOWER(
first_name || ' ' ||
last_name || ' ' ||
COALESCE(company, '') || ' ' ||
COALESCE(email, '')
);
CREATE INDEX idx_contacts_search ON contacts(search_text);
Performance considerations:
String functions in WHERE clauses prevent index usage. This query won’t use an index on email:
-- Inefficient: function prevents index use
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
Instead, store normalized data or use generated columns:
-- Better: add computed column
ALTER TABLE users ADD COLUMN email_lower TEXT;
UPDATE users SET email_lower = LOWER(email);
CREATE INDEX idx_users_email_lower ON users(email_lower);
-- Now this uses the index
SELECT * FROM users WHERE email_lower = 'user@example.com';
For SQLite 3.31.0+, use generated columns:
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT NOT NULL,
email_lower TEXT GENERATED ALWAYS AS (LOWER(email)) STORED
);
CREATE INDEX idx_users_email_lower ON users(email_lower);
Data validation with string functions:
-- Find potentially invalid data
SELECT * FROM users
WHERE LENGTH(TRIM(email)) = 0
OR INSTR(email, '@') = 0
OR INSTR(email, '.') = 0
OR email LIKE '%@%@%'; -- Multiple @ signs
-- Identify records needing cleanup
SELECT
COUNT(*) as records_with_extra_spaces
FROM products
WHERE product_name LIKE '% %' -- Double spaces
OR product_name LIKE ' %' -- Leading space
OR product_name LIKE '% '; -- Trailing space
SQLite’s string functions provide everything you need for text manipulation without external dependencies. Master these functions and you’ll write cleaner queries that perform transformations at the optimal layer—right where your data lives.