SQL - LPAD() and RPAD()

LPAD() and RPAD() are string manipulation functions that pad a string to a specified length by adding characters to the left (LPAD) or right (RPAD) side. The syntax is consistent across most SQL...

Key Insights

  • LPAD() and RPAD() pad strings to a specified length by adding characters to the left or right side, essential for formatting fixed-width reports and aligning data output
  • Both functions take three parameters: the original string, target length, and pad character, with behavior varying significantly across database systems when strings exceed target length
  • Performance considerations matter when padding large result sets—use these functions in the SELECT clause rather than repeatedly in WHERE conditions to avoid index issues

Understanding LPAD and RPAD Fundamentals

LPAD() and RPAD() are string manipulation functions that pad a string to a specified length by adding characters to the left (LPAD) or right (RPAD) side. The syntax is consistent across most SQL databases:

LPAD(string, length, pad_string)
RPAD(string, length, pad_string)

Here’s a basic example demonstrating both functions:

SELECT 
    LPAD('42', 5, '0') AS left_padded,
    RPAD('42', 5, '0') AS right_padded;

Result:

left_padded | right_padded
------------|-------------
00042       | 42000

The functions fill empty space with the pad character until the total string length reaches the specified target. If the original string already meets or exceeds the target length, behavior varies by database system.

Practical Use Cases for Data Formatting

Zero-Padding Numeric IDs

When generating invoice numbers, order IDs, or reference codes that need consistent formatting:

SELECT 
    order_id,
    LPAD(CAST(order_id AS VARCHAR), 8, '0') AS formatted_order_id,
    customer_name
FROM orders
WHERE order_date >= '2024-01-01';

Result:

order_id | formatted_order_id | customer_name
---------|-------------------|---------------
123      | 00000123          | Acme Corp
4567     | 00004567          | TechStart Inc

Creating Fixed-Width Reports

Legacy systems and file exports often require fixed-width format where each field occupies a specific number of characters:

SELECT 
    RPAD(COALESCE(customer_name, ''), 30, ' ') ||
    LPAD(CAST(account_balance AS VARCHAR), 12, ' ') ||
    RPAD(account_status, 10, ' ') AS fixed_width_record
FROM customer_accounts
WHERE account_type = 'PREMIUM';

This produces records where customer names occupy 30 characters, balances occupy 12 characters (right-aligned), and status occupies 10 characters.

Masking Sensitive Data

Partial masking of credit card numbers or account details:

SELECT 
    customer_id,
    RPAD(LEFT(credit_card, 4), 16, '*') AS masked_card
FROM payment_methods;

Result:

customer_id | masked_card
------------|----------------
1001        | 4532************
1002        | 5412************

Database-Specific Behavior and Syntax

PostgreSQL

PostgreSQL implements LPAD and RPAD with standard syntax. When the string exceeds target length, it truncates from the right:

SELECT 
    LPAD('toolongstring', 6, 'X') AS truncated,
    RPAD('short', 10, '-') AS padded;

Result:

truncated | padded
----------|----------
toolon    | short-----

PostgreSQL also allows multi-character pad strings:

SELECT LPAD('data', 15, '=-') AS decorated;

Result:

decorated
-----------------
=-=-=-=-=-=data

MySQL

MySQL syntax matches PostgreSQL, but handles NULL values differently:

SELECT 
    LPAD(NULL, 10, '0') AS null_result,
    LPAD('test', 10, NULL) AS null_pad;

Result:

null_result | null_pad
------------|----------
NULL        | NULL

Any NULL parameter returns NULL for the entire expression.

Oracle

Oracle uses identical syntax but has different truncation behavior. When the original string exceeds the target length, Oracle returns the original string unchanged:

SELECT 
    LPAD('verylongstring', 6, 'X') AS oracle_behavior
FROM dual;

Result:

oracle_behavior
-----------------
verylongstring

This is a critical difference when porting queries between databases.

SQL Server

SQL Server doesn’t have native LPAD/RPAD functions prior to SQL Server 2022. Implement them using REPLICATE and LEN:

-- LPAD equivalent
SELECT 
    REPLICATE('0', 8 - LEN(CAST(order_id AS VARCHAR))) + 
    CAST(order_id AS VARCHAR) AS padded_id
FROM orders;

-- RPAD equivalent
SELECT 
    customer_name + 
    REPLICATE(' ', 30 - LEN(customer_name)) AS padded_name
FROM customers;

SQL Server 2022 and later include STRING_SPLIT, but for true LPAD/RPAD, use user-defined functions:

CREATE FUNCTION dbo.LPAD
(
    @string VARCHAR(MAX),
    @length INT,
    @pad_char VARCHAR(1)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
    RETURN CASE 
        WHEN LEN(@string) >= @length THEN LEFT(@string, @length)
        ELSE REPLICATE(@pad_char, @length - LEN(@string)) + @string
    END
END;

Advanced Patterns and Combinations

Dynamic Padding Based on Maximum Length

When you need consistent padding across a result set based on the longest value:

WITH max_length AS (
    SELECT MAX(LENGTH(product_code)) AS max_len
    FROM products
)
SELECT 
    LPAD(product_code, max_len + 2, '0') AS normalized_code,
    product_name
FROM products
CROSS JOIN max_length;

Combining with TRIM for Data Cleanup

Remove existing whitespace before padding to ensure consistent results:

SELECT 
    RPAD(TRIM(customer_name), 40, '.') AS formatted_name,
    LPAD(TRIM(CAST(amount AS VARCHAR)), 10, ' ') AS formatted_amount
FROM transactions
WHERE transaction_date = CURRENT_DATE;

Building Hierarchical Displays

Create visual tree structures using padding:

WITH RECURSIVE org_tree AS (
    SELECT 
        employee_id,
        manager_id,
        employee_name,
        0 AS level
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    SELECT 
        e.employee_id,
        e.manager_id,
        e.employee_name,
        ot.level + 1
    FROM employees e
    JOIN org_tree ot ON e.manager_id = ot.employee_id
)
SELECT 
    LPAD('', level * 4, ' ') || employee_name AS org_chart
FROM org_tree
ORDER BY level, employee_name;

Result:

org_chart
--------------------
CEO Smith
    VP Jones
        Manager Brown
        Manager Davis
    VP Wilson

Performance Considerations

Padding functions are computationally inexpensive for individual rows but can impact performance on large result sets. Key optimization strategies:

Avoid in WHERE clauses: Padding in filter conditions prevents index usage:

-- Bad: Prevents index use
SELECT * FROM orders 
WHERE LPAD(CAST(order_id AS VARCHAR), 8, '0') = '00001234';

-- Good: Use original value
SELECT * FROM orders 
WHERE order_id = 1234;

Apply padding in presentation layer when possible: If your application framework can handle formatting, do it there rather than in SQL to reduce database CPU load.

Use computed columns for frequently accessed padded values: In SQL Server and other systems supporting computed columns:

ALTER TABLE products
ADD formatted_sku AS LPAD(CAST(sku AS VARCHAR), 12, '0') PERSISTED;

CREATE INDEX idx_formatted_sku ON products(formatted_sku);

Edge Cases and Gotchas

Empty string vs NULL: Different databases handle empty strings differently. Always test NULL handling:

SELECT 
    LPAD('', 5, 'X') AS empty_string,
    LPAD(NULL, 5, 'X') AS null_value;

Multi-byte characters: When working with Unicode data, character length vs byte length matters:

-- PostgreSQL example with Unicode
SELECT 
    LPAD('日本', 5, 'X') AS unicode_padded,
    LENGTH(LPAD('日本', 5, 'X')) AS char_length;

Pad string longer than gap: When the pad string is longer than the remaining space, databases repeat it partially:

SELECT LPAD('test', 10, 'ABC') AS partial_repeat;

Result:

partial_repeat
--------------
ABCABCtest

Understanding these functions deeply allows you to generate properly formatted output directly from your queries, reducing post-processing requirements and ensuring consistency across your application’s data layer.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.