Run-Length Encoding: Simple Compression
Key Insights
- Run-length encoding replaces consecutive identical elements with count-value pairs, making it extremely effective for data with long runs but potentially counterproductive for random or varied data
- RLE’s simplicity (O(n) time, O(1) space for streaming) makes it ideal as a preprocessing step before more sophisticated compression algorithms like Huffman or LZ-based methods
- Despite being one of the oldest compression techniques, RLE remains relevant in modern applications including bitmap images, game development, and real-time data streaming where computational overhead matters
What is Run-Length Encoding?
Run-length encoding is one of the simplest compression algorithms you’ll encounter. The concept is straightforward: instead of storing repeated consecutive elements individually, you store a count followed by the element itself. The string “AAAAAAA” becomes “7A” — seven bytes reduced to two.
RLE dates back to the 1960s when bandwidth and storage were precious commodities. It was among the first compression schemes used in television signals and remains embedded in formats you use today. While modern compression algorithms like gzip or zstd offer superior ratios for general-purpose data, RLE’s simplicity and speed keep it relevant for specific use cases.
In the compression landscape, RLE sits at the “fast and simple” end of the spectrum. It won’t compete with dictionary-based methods for text compression, but when your data has natural runs — think binary images, sensor readings, or game tile maps — RLE delivers solid compression with minimal computational overhead.
How RLE Works
The encoding process scans input sequentially, counting consecutive identical elements. When the element changes, you emit the count and value, then reset your counter.
Consider encoding “AAABBBCCCC”:
- Read ‘A’, count = 1
- Read ‘A’, count = 2
- Read ‘A’, count = 3
- Read ‘B’ (different), emit “3A”, reset count = 1
- Read ‘B’, count = 2
- Read ‘B’, count = 3
- Read ‘C’ (different), emit “3B”, reset count = 1
- Continue until end…
Result: “3A3B4C” — 10 characters compressed to 6.
Here’s a basic implementation:
def rle_encode(data: str) -> str:
    if not data:
        return ""
    encoded = []
    count = 1
    current = data[0]
    for char in data[1:]:
        if char == current:
            count += 1
        else:
            encoded.append(f"{count}{current}")
            current = char
            count = 1
    # Don't forget the last run
    encoded.append(f"{count}{current}")
    return "".join(encoded)
def rle_decode(data: str) -> str:
    if not data:
        return ""
    decoded = []
    i = 0
    while i < len(data):
        # Extract the count (could be multiple digits)
        count_str = ""
        while i < len(data) and data[i].isdigit():
            count_str += data[i]
            i += 1
        count = int(count_str)
        # Note: this textbook scheme assumes the original data
        # contains no digit characters
        char = data[i]
        decoded.append(char * count)
        i += 1
    return "".join(decoded)
# Test it
original = "AAABBBCCCCDDDDDDDD"
encoded = rle_encode(original)
decoded = rle_decode(encoded)
print(f"Original: {original} (length: {len(original)})")
print(f"Encoded: {encoded} (length: {len(encoded)})")
print(f"Decoded: {decoded}")
print(f"Compression ratio: {len(original) / len(encoded):.2f}x")
Implementation Variations
The naive implementation above has problems. What happens with “ABCD”? It becomes “1A1B1C1D” — we’ve doubled the size. Real-world RLE implementations handle these edge cases.
Count-first vs. Value-first: Some formats store the value before the count. PackBits (used in TIFF) uses a signed byte where negative values indicate runs and positive values indicate literal sequences.
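To make the PackBits idea concrete, here is a sketch of a PackBits-style decoder. It follows the signed-header-byte scheme described above but is illustrative rather than a byte-exact TIFF implementation:

```python
def packbits_decode(data: bytes) -> bytes:
    """Decode a PackBits-style stream: a signed header byte precedes
    either a literal sequence or a run (sketch, not TIFF-exact)."""
    out = bytearray()
    i = 0
    while i < len(data):
        header = data[i]
        if header < 0x80:
            # 0..127: copy the next header + 1 bytes literally
            out += data[i + 1 : i + 2 + header]
            i += 2 + header
        elif header > 0x80:
            # 129..255 (negative as a signed byte): repeat the
            # next byte 257 - header times
            out += bytes([data[i + 1]]) * (257 - header)
            i += 2
        else:
            # 0x80 (-128) is a no-op in PackBits
            i += 1
    return bytes(out)

# A run of three 0x41 bytes followed by the literals 0x42 0x43 0x44
print(packbits_decode(bytes([0xFE, 0x41, 0x02, 0x42, 0x43, 0x44])))
```

The signed header doubles as the run/literal flag, so no separate marker byte is needed and literal bytes never require escaping.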
Handling single characters: One approach uses escape sequences. Another stores runs only when count exceeds a threshold:
def rle_encode_optimized(data: bytes) -> bytes:
    """
    Optimized RLE for binary data.
    Uses a marker byte (0xFF) to indicate runs.
    Format: For runs >= 4, emit [0xFF, count, value]
            For shorter sequences, emit literals.
    """
    if not data:
        return b""
    MARKER = 0xFF
    MIN_RUN = 4
    result = bytearray()
    i = 0
    while i < len(data):
        # Count the run length (capped at 255 so it fits in one byte)
        run_start = i
        current = data[i]
        while i < len(data) and data[i] == current and (i - run_start) < 255:
            i += 1
        run_length = i - run_start
        if run_length >= MIN_RUN:
            # Encode as a run
            result.append(MARKER)
            result.append(run_length)
            result.append(current)
        else:
            # Emit literals, escaping any marker bytes
            for j in range(run_start, i):
                if data[j] == MARKER:
                    result.append(MARKER)
                    result.append(1)
                    result.append(MARKER)
                else:
                    result.append(data[j])
    return bytes(result)
def rle_decode_optimized(data: bytes) -> bytes:
    MARKER = 0xFF
    result = bytearray()
    i = 0
    while i < len(data):
        if data[i] == MARKER:
            count = data[i + 1]
            value = data[i + 2]
            result.extend([value] * count)
            i += 3
        else:
            result.append(data[i])
            i += 1
    return bytes(result)
The trade-off is clear: the optimized version adds complexity but avoids expansion for non-repetitive data. Choose based on your data characteristics.
Best and Worst Case Scenarios
RLE’s effectiveness depends entirely on input characteristics. Let’s quantify this:
import random

def compression_ratio(original: bytes, compressed: bytes) -> float:
    if len(compressed) == 0:
        return float('inf')
    return len(original) / len(compressed)

def analyze_rle_effectiveness():
    test_cases = {
        "Long runs": bytes([0] * 100 + [255] * 100 + [0] * 100),
        "Alternating": bytes([0, 255] * 150),
        "Random bytes": bytes(random.randint(0, 254) for _ in range(300)),
        "Bitmap-like": bytes([0] * 50 + [255] * 10 + [0] * 50) * 3,
        "Gradient": bytes(range(256)) + bytes(range(255, -1, -1)),
    }
    print(f"{'Data Type':<20} {'Original':<10} {'Compressed':<12} {'Ratio':<10}")
    print("-" * 55)
    for name, data in test_cases.items():
        compressed = rle_encode_optimized(data)
        ratio = compression_ratio(data, compressed)
        status = "✓" if ratio > 1 else "✗"
        print(f"{name:<20} {len(data):<10} {len(compressed):<12} {ratio:.2f}x {status}")

analyze_rle_effectiveness()
Best cases (compression ratio > 2x):
- Binary images with large solid regions
- Sensor data with stable readings
- Sparse matrices stored as arrays
- Game tile maps with repeated tiles
Worst cases (expansion):
- Compressed data (already random-looking)
- Natural language text
- Encrypted data
- High-entropy audio/video
The rule of thumb: if your data has runs averaging 3+ elements, RLE helps. Below that, you’re likely expanding.
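A quick way to apply this rule of thumb is to measure the average run length of a data sample before committing to RLE. A minimal helper (the function name is mine):

```python
def average_run_length(data: bytes) -> float:
    """Total length divided by the number of runs; below roughly 3,
    plain RLE is unlikely to pay off."""
    if not data:
        return 0.0
    runs = 1
    for prev, cur in zip(data, data[1:]):
        if cur != prev:
            runs += 1
    return len(data) / runs

print(average_run_length(b"AAABBBCCC"))  # long runs: 3.0
print(average_run_length(b"ABCDABCD"))   # no runs: 1.0
```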
Real-World Applications
Despite its age, RLE appears in production systems everywhere:
Bitmap images (BMP, PCX): The BMP format supports RLE compression for 4-bit and 8-bit images. PCX files use RLE exclusively. These formats persist in legacy systems and embedded applications.
Fax machines: The ITU-T T.4 standard builds on RLE: each scanline is reduced to alternating runs of white and black pixels, which are then represented with Huffman codes ("Modified Huffman" coding). This works brilliantly for text documents, which are mostly white space.
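The first step of that scheme, turning a scanline into alternating white/black run lengths, can be sketched as follows (by convention the sequence starts with a white run, so a line that begins black gets a zero-length white run; the function name is mine):

```python
def scanline_runs(pixels: list[int]) -> list[int]:
    """Convert a bilevel scanline (0 = white, 1 = black) into
    alternating run lengths, starting with a white run."""
    runs = []
    current = 0  # runs alternate, starting from white
    count = 0
    for p in pixels:
        if p == current:
            count += 1
        else:
            runs.append(count)
            current = p
            count = 1
    runs.append(count)
    return runs

print(scanline_runs([0, 0, 0, 1, 1, 0]))  # [3, 2, 1]
print(scanline_runs([1, 1, 0]))           # [0, 2, 1]
```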
JPEG preprocessing: After zigzag ordering, JPEG run-length encodes the runs of zeros among the quantized DCT coefficients. The zeros cluster toward the end of each block, making RLE effective before the final entropy coding.
Game development: Tile-based games compress level data with RLE. A platformer level might be 90% empty space — perfect for run-length encoding.
Database storage: Column-oriented databases use RLE for sorted columns. A column of country codes sorted alphabetically creates long runs of identical values.
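The column case is just RLE over values instead of bytes; a minimal sketch (illustrative, not any particular engine's on-disk format):

```python
def rle_column(values: list[str]) -> list[tuple[str, int]]:
    """Collapse a sorted column into (value, run_length) pairs."""
    runs: list[list] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

print(rle_column(["DE", "DE", "FR", "FR", "FR", "US"]))
# [('DE', 2), ('FR', 3), ('US', 1)]
```

Because the column is sorted, every distinct value appears as exactly one run, so the encoded size is bounded by the number of distinct values rather than the number of rows.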
Combining RLE with Other Techniques
RLE’s real power emerges when combined with other algorithms. The Burrows-Wheeler Transform (BWT) rearranges data to cluster similar characters together, creating runs that RLE can exploit:
def burrows_wheeler_transform(data: bytes) -> tuple[bytes, int]:
    """Simple BWT implementation."""
    if not data:
        return b"", 0
    # Add end-of-string marker (assumes 0x00 does not appear in the data)
    s = data + bytes([0])
    # Create all rotations and sort
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # Extract last column
    last_column = bytes(r[-1] for r in rotations)
    # Find original string position
    original_index = rotations.index(s)
    return last_column, original_index
def inverse_bwt(data: bytes, index: int) -> bytes:
    """Reverse the BWT."""
    n = len(data)
    # Build the transformation vector (Python's stable sort keeps
    # equal bytes in their original order, which this relies on)
    sorted_data = sorted(range(n), key=lambda i: data[i])
    # Follow the chain
    result = bytearray()
    current = index
    for _ in range(n - 1):  # Exclude the marker
        current = sorted_data[current]
        result.append(data[current])
    return bytes(result)
def bwt_rle_compress(data: bytes) -> tuple[bytes, int]:
    """Combine BWT with RLE for better compression."""
    transformed, index = burrows_wheeler_transform(data)
    compressed = rle_encode_optimized(transformed)
    return compressed, index
# Demonstrate the improvement
test_data = b"banana" * 20
# RLE alone
rle_only = rle_encode_optimized(test_data)
# BWT + RLE
bwt_rle, idx = bwt_rle_compress(test_data)
print(f"Original size: {len(test_data)}")
print(f"RLE only: {len(rle_only)} ({len(test_data)/len(rle_only):.2f}x)")
print(f"BWT + RLE: {len(bwt_rle)} ({len(test_data)/len(bwt_rle):.2f}x)")
A BWT + RLE pipeline like this sits at the heart of bzip2: the BWT clusters similar bytes, move-to-front coding and RLE handle the resulting runs, and Huffman coding finishes the job.
Conclusion
Run-length encoding won’t win compression benchmarks against modern algorithms, but that’s not the point. Its value lies in simplicity: O(n) encoding, O(n) decoding, minimal memory overhead, and trivial implementation.
Reach for RLE when you have:
- Data with natural runs (images, sensor readings, sparse data)
- Strict latency requirements where complex algorithms aren’t viable
- A preprocessing step before another compression stage
- Embedded systems with limited computational resources
Skip RLE when dealing with text, already-compressed data, or anything approaching random distribution. The algorithm will expand your data rather than compress it.
The best compression strategy often combines techniques. RLE as a preprocessing step before entropy coding leverages each algorithm’s strengths. Understanding when simple solutions work — and when they don’t — separates effective engineers from those who reflexively reach for complex tools.