How to Use String Functions in SQLite

SQLite includes a comprehensive set of string manipulation functions that let you transform, search, and analyze text data directly in your queries. While SQLite is known for being lightweight and...

Key Insights

  • SQLite’s string functions execute directly in the database layer, eliminating the need to pull data into application code for basic text manipulation and improving query performance
  • The concatenation operator || and functions like SUBSTR() and REPLACE() enable powerful data transformations in SELECT statements, making SQLite suitable for ETL-like operations without external tools
  • Understanding the case-sensitivity differences between LIKE (case-insensitive for ASCII) and GLOB (case-sensitive) prevents common bugs in pattern matching queries

Introduction to SQLite String Functions

SQLite includes a comprehensive set of string manipulation functions that let you transform, search, and analyze text data directly in your queries. While SQLite is known for being lightweight and embedded, it doesn’t skimp on string handling capabilities. These functions are essential for data cleaning, formatting output, implementing search features, and performing text-based filtering without round-tripping data through your application layer.

The key advantage is performance and simplicity. Instead of fetching thousands of rows and processing them in Python, JavaScript, or another language, you can transform data at the database level where it’s most efficient. This article covers the most useful string functions with practical examples you can implement immediately.

Basic String Manipulation Functions

The foundational string functions handle common transformations you’ll use constantly.

UPPER() and LOWER() convert strings to uppercase or lowercase. This is critical for case-insensitive comparisons:

-- Find products regardless of how the name was entered
SELECT * FROM products 
WHERE LOWER(product_name) = LOWER('widget pro');

-- Standardize output for reports
SELECT 
    UPPER(category) as category,
    COUNT(*) as product_count
FROM products
GROUP BY UPPER(category);

LENGTH() returns the number of characters in a string. Use it for validation or filtering:

-- Find potentially invalid phone numbers
SELECT customer_id, phone 
FROM customers 
WHERE LENGTH(phone) < 10;

-- Get articles by length category
SELECT 
    title,
    CASE 
        WHEN LENGTH(content) < 500 THEN 'short'
        WHEN LENGTH(content) < 2000 THEN 'medium'
        ELSE 'long'
    END as article_length
FROM blog_posts;

TRIM() removes whitespace from both ends of a string. The variants LTRIM() and RTRIM() trim only the left or right side:

-- Clean up user input data
UPDATE users 
SET email = TRIM(email),
    first_name = TRIM(first_name),
    last_name = TRIM(last_name)
WHERE email LIKE ' %' OR email LIKE '% ';

-- Remove specific characters
SELECT TRIM(price, '$') as numeric_price 
FROM products;

The second parameter to TRIM() lets you specify which characters to remove, not just whitespace.

Substring and Search Functions

When you need to extract portions of strings or locate text within them, these functions are indispensable.

SUBSTR() extracts a substring starting at a position for a specified length. SQLite uses 1-based indexing:

-- Extract area code from phone numbers (assuming format: 555-123-4567)
SELECT 
    customer_id,
    SUBSTR(phone, 1, 3) as area_code,
    SUBSTR(phone, 5, 3) as exchange,
    SUBSTR(phone, 9, 4) as line_number
FROM customers;

-- Get the first 50 characters for preview text
SELECT 
    title,
    SUBSTR(content, 1, 50) || '...' as preview
FROM articles;

INSTR() returns the position of the first occurrence of a substring, or 0 if not found:

-- Find the position of '@' in email addresses
SELECT 
    email,
    INSTR(email, '@') as at_position
FROM users;

-- Extract username from email (everything before @)
SELECT 
    email,
    SUBSTR(email, 1, INSTR(email, '@') - 1) as username
FROM users
WHERE INSTR(email, '@') > 0;

-- Validate email format (must contain @)
SELECT * FROM users 
WHERE INSTR(email, '@') = 0;

REPLACE() substitutes all occurrences of a substring with another string:

-- Normalize phone number format
UPDATE customers 
SET phone = REPLACE(REPLACE(REPLACE(phone, '-', ''), '(', ''), ')', '');

-- Change domain in email addresses
UPDATE users 
SET email = REPLACE(email, '@oldcompany.com', '@newcompany.com')
WHERE email LIKE '%@oldcompany.com';

-- Remove unwanted characters from product codes
SELECT 
    product_code,
    REPLACE(REPLACE(product_code, '-', ''), ' ', '') as clean_code
FROM inventory;

Pattern Matching with LIKE and GLOB

SQLite provides two pattern matching operators with different behaviors.

LIKE is case-insensitive for ASCII characters and uses % (matches any sequence) and _ (matches single character):

-- Find customers whose names start with 'A'
SELECT * FROM customers 
WHERE last_name LIKE 'A%';

-- Find emails from specific domain
SELECT * FROM users 
WHERE email LIKE '%@gmail.com';

-- Find phone numbers with specific pattern (555-1___)
SELECT * FROM customers 
WHERE phone LIKE '555-1___';

-- Search for products containing 'pro' anywhere in the name
SELECT * FROM products 
WHERE product_name LIKE '%pro%';

GLOB is case-sensitive and uses Unix-style wildcards: * (any sequence), ? (single character), and [] (character ranges):

-- Case-sensitive search for products starting with capital P
SELECT * FROM products 
WHERE product_name GLOB 'P*';

-- Find codes with specific pattern (must start with capital letter)
SELECT * FROM items 
WHERE item_code GLOB '[A-Z][0-9][0-9][0-9]';

-- Match files with specific extensions
SELECT * FROM files 
WHERE filename GLOB '*.{jpg,png,gif}';

The critical difference: LIKE 'abc' matches ‘ABC’, but GLOB 'abc' does not. Choose based on your case-sensitivity requirements.

Advanced String Concatenation

Building strings from multiple columns is a common requirement for reports and formatted output.

The || operator concatenates strings:

-- Build full names
SELECT 
    first_name || ' ' || last_name as full_name,
    email
FROM users;

-- Create formatted addresses
SELECT 
    street || ', ' || city || ', ' || state || ' ' || zip as full_address
FROM addresses;

-- Handle NULL values with COALESCE
SELECT 
    first_name || ' ' || COALESCE(middle_name || ' ', '') || last_name as full_name
FROM users;

PRINTF() provides C-style formatted string output:

-- Format prices with currency
SELECT 
    product_name,
    PRINTF('$%.2f', price) as formatted_price
FROM products;

-- Create zero-padded IDs
SELECT 
    PRINTF('ORD-%06d', order_id) as formatted_order_id,
    order_date
FROM orders;

-- Build formatted labels
SELECT 
    PRINTF('%s (%d items)', category, item_count) as category_label
FROM category_summary;

The PRINTF() function supports standard format specifiers: %s (string), %d (integer), %f (float), and modifiers for padding and precision.

Practical Use Cases and Performance Tips

Combine multiple string functions for complex transformations:

-- Clean and standardize product names
UPDATE products 
SET product_name = UPPER(TRIM(REPLACE(REPLACE(product_name, '  ', ' '), '-', ' ')))
WHERE product_name != UPPER(TRIM(REPLACE(REPLACE(product_name, '  ', ' '), '-', ' ')));

-- Create searchable full-text column
ALTER TABLE contacts ADD COLUMN search_text TEXT;

UPDATE contacts 
SET search_text = LOWER(
    first_name || ' ' || 
    last_name || ' ' || 
    COALESCE(company, '') || ' ' || 
    COALESCE(email, '')
);

CREATE INDEX idx_contacts_search ON contacts(search_text);

Performance considerations:

String functions in WHERE clauses prevent index usage. This query won’t use an index on email:

-- Inefficient: function prevents index use
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

Instead, store normalized data or use generated columns:

-- Better: add computed column
ALTER TABLE users ADD COLUMN email_lower TEXT;
UPDATE users SET email_lower = LOWER(email);
CREATE INDEX idx_users_email_lower ON users(email_lower);

-- Now this uses the index
SELECT * FROM users WHERE email_lower = 'user@example.com';

For SQLite 3.31.0+, use generated columns:

CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL,
    email_lower TEXT GENERATED ALWAYS AS (LOWER(email)) STORED
);

CREATE INDEX idx_users_email_lower ON users(email_lower);

Data validation with string functions:

-- Find potentially invalid data
SELECT * FROM users 
WHERE LENGTH(TRIM(email)) = 0 
   OR INSTR(email, '@') = 0
   OR INSTR(email, '.') = 0
   OR email LIKE '%@%@%';  -- Multiple @ signs

-- Identify records needing cleanup
SELECT 
    COUNT(*) as records_with_extra_spaces
FROM products 
WHERE product_name LIKE '%  %'  -- Double spaces
   OR product_name LIKE ' %'    -- Leading space
   OR product_name LIKE '% ';   -- Trailing space

SQLite’s string functions provide everything you need for text manipulation without external dependencies. Master these functions and you’ll write cleaner queries that perform transformations at the optimal layer—right where your data lives.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.