Zstandard: Modern Compression Algorithm
Key Insights
- Zstandard delivers compression ratios comparable to zlib while operating at speeds closer to LZ4, making it the default choice for most modern compression needs
- Dictionary compression can improve ratios by 5-10x for small, repetitive data like logs, JSON payloads, and protocol buffers
- Compression levels 1-4 suit real-time applications, 5-9 work for general storage, and 10+ should be reserved for archival where CPU time is cheap
Introduction to Zstandard
Zstandard (zstd) emerged from Facebook in 2016, created by Yann Collet—the same engineer behind LZ4. The motivation was straightforward: existing compression algorithms forced an uncomfortable trade-off between speed and compression ratio. Gzip offered decent ratios but crawled on large datasets. LZ4 flew but left significant compression on the table. Brotli optimized for web delivery but wasn’t designed for general-purpose use.
Zstandard occupies the sweet spot. It matches or exceeds gzip’s compression ratio while operating at speeds that approach LZ4. This isn’t marketing fluff: it’s the reason zstd has been adopted in the Linux kernel (for btrfs, zram, and compressed kernel images), replaced gzip in Arch Linux packages, and powers compression at Meta, Cloudflare, and countless other infrastructure-heavy organizations.
How Zstandard Works
Zstandard combines three core techniques: dictionary coding (LZ77-style matching), entropy encoding using Finite State Entropy (FSE) and Huffman coding, and an efficient block structure that enables streaming.
The algorithm processes data in blocks, identifying repeated sequences and replacing them with back-references. These references and literal data then pass through FSE encoding, a tabled implementation of asymmetric numeral systems (ANS) that Collet developed, which outperforms traditional Huffman coding on ratio (it can spend fractional bits per symbol) while remaining computationally cheap.
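The back-referencing step can be illustrated with a toy LZ77-style parser. This is a deliberately naive sketch of the concept; zstd's real match finder uses hash tables, chains, and far smarter match selection:

```python
def toy_lz77(data: bytes, window: int = 4096, min_match: int = 4):
    """Greedy LZ77-style parse: emit (literal, offset, length) tuples.

    Illustrative only -- real zstd does nothing this slow or simple.
    """
    out = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        start = max(0, i - window)
        # Scan the sliding window for the longest match ending at i
        for j in range(start, i):
            length = 0
            while (i + length < len(data)
                   and data[j + length] == data[i + length]
                   and length < 255):
                length += 1
            if length >= min_match and length > best_len:
                best_len, best_off = length, i - j
        if best_len:
            out.append((b"", best_off, best_len))  # back-reference
            i += best_len
        else:
            out.append((data[i:i + 1], 0, 0))  # literal byte
            i += 1
    return out

sequences = toy_lz77(b"abcabcabcabc")
# The repeated "abc" collapses into a single back-reference
print(sequences)
```

In real zstd, these (literal, offset, length) sequences are what then flow into the FSE and Huffman entropy stages.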
Compression levels range from 1 to 22, with level 3 as the default; levels 20-22 are gated behind the --ultra flag, and negative "fast" levels trade additional ratio for even more speed. Lower levels prioritize speed; higher levels spend more CPU cycles searching for better matches. The relationship isn’t linear—level 19 might take 10x longer than level 10 while only improving compression by 5%.
import zstandard as zstd
# Basic compression at default level (3)
compressor = zstd.ZstdCompressor()
original_data = b"Your data here. " * 1000
compressed = compressor.compress(original_data)
print(f"Original: {len(original_data)} bytes")
print(f"Compressed: {len(compressed)} bytes")
print(f"Ratio: {len(original_data) / len(compressed):.2f}x")
# Decompression
decompressor = zstd.ZstdDecompressor()
restored = decompressor.decompress(compressed)
assert restored == original_data
# Higher compression level for archival
archival_compressor = zstd.ZstdCompressor(level=19)
archival_compressed = archival_compressor.compress(original_data)
print(f"Archival compressed: {len(archival_compressed)} bytes")
Performance Benchmarks
Raw numbers matter more than marketing. Here’s a benchmark comparing common compression algorithms on a 100MB sample of mixed content (logs, JSON, and binary data):
import zstandard as zstd
import gzip
import lzma
import time
def benchmark_compression(data: bytes, name: str, compress_func, decompress_func):
    # Compression
    start = time.perf_counter()
    compressed = compress_func(data)
    compress_time = time.perf_counter() - start

    # Decompression
    start = time.perf_counter()
    decompressed = decompress_func(compressed)
    decompress_time = time.perf_counter() - start
    assert decompressed == data  # sanity-check the round trip

    ratio = len(data) / len(compressed)
    compress_speed = len(data) / compress_time / 1024 / 1024  # MB/s
    decompress_speed = len(data) / decompress_time / 1024 / 1024

    print(f"{name:12} | Ratio: {ratio:5.2f}x | "
          f"Compress: {compress_speed:7.1f} MB/s | "
          f"Decompress: {decompress_speed:7.1f} MB/s")
# Sample data - in practice, use your actual workload
with open("/path/to/sample/file", "rb") as f:
    sample_data = f.read()

# Zstandard at different levels
for level in [1, 3, 9, 19]:
    cctx = zstd.ZstdCompressor(level=level)
    dctx = zstd.ZstdDecompressor()
    benchmark_compression(
        sample_data, f"zstd-{level}",
        cctx.compress, dctx.decompress
    )

# Gzip
benchmark_compression(
    sample_data, "gzip-6",
    lambda d: gzip.compress(d, compresslevel=6),
    gzip.decompress
)

# LZMA/XZ
benchmark_compression(
    sample_data, "lzma",
    lzma.compress,
    lzma.decompress
)
Typical results on modern hardware show zstd-3 compressing at 300-500 MB/s with ratios around 3-4x on mixed data. Gzip-6 achieves similar ratios but at 30-50 MB/s. LZMA wins on ratio (5-6x) but crawls at 5-10 MB/s.
Zstandard excels when you need both reasonable compression and throughput—log aggregation, database backups, network transfer, and container images. LZMA wins for cold archival where decompression is rare. LZ4 wins for in-memory caching where nanoseconds matter.
Dictionary Compression
Dictionary compression is Zstandard’s secret weapon for small data. When compressing individual log lines, JSON API responses, or protocol buffer messages, standard compression falls flat: each payload is compressed independently, so there’s not enough data within any single message to identify patterns.
Dictionaries solve this by pre-training on representative samples. The compressor then references this shared dictionary, dramatically improving ratios on small payloads.
import zstandard as zstd
import json
# Sample training data - use real examples from your domain
training_samples = [
    json.dumps({"user_id": 12345, "action": "login", "timestamp": "2024-01-15T10:30:00Z", "metadata": {"ip": "192.168.1.1", "user_agent": "Mozilla/5.0"}}),
    json.dumps({"user_id": 67890, "action": "purchase", "timestamp": "2024-01-15T10:31:00Z", "metadata": {"ip": "10.0.0.1", "user_agent": "Chrome/120"}}),
    json.dumps({"user_id": 11111, "action": "logout", "timestamp": "2024-01-15T10:32:00Z", "metadata": {"ip": "172.16.0.1", "user_agent": "Safari/17"}}),
    # Add hundreds more representative samples
] * 100  # Repeat for training purposes

# Train the dictionary
dict_data = zstd.train_dictionary(
    dict_size=32768,  # 32KB dictionary
    samples=[s.encode() for s in training_samples]
)
# Save dictionary for distribution to all nodes
with open("api_response.dict", "wb") as f:
    f.write(dict_data.as_bytes())
# Compress with dictionary
compressor = zstd.ZstdCompressor(dict_data=dict_data)
decompressor = zstd.ZstdDecompressor(dict_data=dict_data)
# Test on new data
test_payload = json.dumps({
    "user_id": 99999,
    "action": "view",
    "timestamp": "2024-01-15T10:35:00Z",
    "metadata": {"ip": "8.8.8.8", "user_agent": "Firefox/121"}
}).encode()
compressed_with_dict = compressor.compress(test_payload)
compressed_without = zstd.compress(test_payload)
print(f"Original: {len(test_payload)} bytes")
print(f"Without dictionary: {len(compressed_without)} bytes")
print(f"With dictionary: {len(compressed_with_dict)} bytes")
print(f"Dictionary improvement: {len(compressed_without) / len(compressed_with_dict):.2f}x")
For small payloads under 1KB, dictionary compression routinely achieves 5-10x better ratios than standard compression. The catch: both compressor and decompressor must share the same dictionary. This works well for controlled environments like microservices or log pipelines, less well for arbitrary clients.
Streaming and Real-World Integration
Real workloads involve files that don’t fit in memory and streams that never end. Zstandard handles both elegantly:
import zstandard as zstd
def compress_file_streaming(input_path: str, output_path: str, level: int = 3):
    """Compress a file using streaming, suitable for files of any size."""
    compressor = zstd.ZstdCompressor(level=level)
    with open(input_path, "rb") as input_file:
        with open(output_path, "wb") as output_file:
            # Stream compressor handles chunking internally
            with compressor.stream_writer(output_file) as writer:
                while chunk := input_file.read(16384):  # 16KB chunks
                    writer.write(chunk)

def decompress_file_streaming(input_path: str, output_path: str):
    """Decompress a file using streaming."""
    decompressor = zstd.ZstdDecompressor()
    with open(input_path, "rb") as input_file:
        with open(output_path, "wb") as output_file:
            # Compressed bytes written to this writer come out
            # decompressed in output_file
            with decompressor.stream_writer(output_file) as writer:
                while chunk := input_file.read(16384):
                    writer.write(chunk)

# For network protocols, use stream_reader for pull-based APIs
def process_compressed_stream(compressed_file, process_chunk):
    """Read and process compressed data without buffering the full result."""
    decompressor = zstd.ZstdDecompressor()
    with open(compressed_file, "rb") as f:
        with decompressor.stream_reader(f) as reader:
            while chunk := reader.read(8192):
                # Hand each decompressed chunk to the caller's handler
                process_chunk(chunk)
Integration points abound. Nginx can serve zstd-encoded responses via the third-party zstd-nginx-module. PostgreSQL 15 added zstd as an option for WAL and base backup compression. Kafka has supported zstd as a message compression codec since version 2.1. Modern backup tools such as restic and borgbackup support zstd natively.
Practical Implementation Considerations
Choosing the right compression level depends on your constraints:
# Real-time log compression - speed matters
zstd -1 -T0 access.log -o access.log.zst
# General file storage - balanced
zstd -3 database_dump.sql -o database_dump.sql.zst
# Archival storage - ratio matters, time doesn't
zstd -19 --long cold_archive.tar -o cold_archive.tar.zst
# Maximum compression with long-range matching (--ultra unlocks levels 20-22)
zstd --ultra -22 --long=31 huge_dataset.bin -o huge_dataset.bin.zst
# Parallel compression using all cores
zstd -T0 -3 large_file.bin -o large_file.bin.zst
Memory requirements scale with compression level. Level 3 uses roughly 1MB per thread; level 19 can require 128MB or more. The --long flag enables a larger long-range matching window, further increasing memory but improving ratios on repetitive data like virtual machine images. Note that a file compressed with --long=31 must also be decompressed with --long=31 (or a matching --memory limit), since the decoder needs an equally large window.
For multi-threaded applications, create one compressor per thread or use the library’s built-in threading:
import zstandard as zstd
# Multi-threaded compression
compressor = zstd.ZstdCompressor(level=3, threads=-1) # -1 = auto-detect cores
compressed = compressor.compress(large_data)
# Note: individual ZstdCompressor objects make no thread-safety guarantees;
# create one per thread rather than sharing an instance for simultaneous calls
Conclusion
Zstandard has earned its position as the default compression algorithm for modern infrastructure. It supersedes gzip for nearly every use case that doesn’t demand gzip compatibility, offering better ratios at higher speeds. Dictionary compression unlocks efficiency for small-data workloads that previously couldn’t benefit from compression at all.
Choose Zstandard when you need general-purpose compression with good performance. Choose LZ4 when decompression speed is critical and you can sacrifice ratio. Choose LZMA/XZ for cold archival where you’ll compress once and rarely decompress.
The adoption trajectory is clear: Linux kernel, systemd, package managers, databases, and backup tools have all moved to zstd. If you’re still defaulting to gzip, you’re leaving performance on the table.