Redis Persistence: RDB and AOF
Key Insights
- RDB provides compact point-in-time snapshots ideal for backups but risks losing data between intervals, while AOF logs every write operation for maximum durability at the cost of larger files and slower restarts.
- Redis 4.0+ hybrid persistence combines RDB’s fast recovery with AOF’s durability by using snapshots as a base and replaying recent append-only logs during restart.
- Production deployments should enable hybrid persistence with appendfsync everysec, implement automated backups to separate storage, and monitor persistence metrics to detect rewrite failures or fork issues.
Introduction to Redis Persistence
Redis is fundamentally an in-memory database, which makes it blazingly fast. But memory is volatile—when your Redis server restarts, everything vanishes unless you’ve configured persistence. This creates a classic trade-off: pure in-memory operation gives you maximum performance, while persistence adds durability at the cost of some overhead.
Redis offers two persistence mechanisms that you can use independently or together: RDB (Redis Database) snapshots and AOF (Append-Only File) logging. Understanding how each works, their performance characteristics, and when to use them is critical for production deployments where data loss is unacceptable.
RDB (Redis Database) Snapshots
RDB persistence creates point-in-time snapshots of your dataset at specified intervals. Redis forks a child process that writes the entire dataset to a binary dump file (typically dump.rdb) while the parent process continues serving requests. This copy-on-write mechanism means snapshots happen in the background without blocking client operations.
The snapshot frequency is controlled by save directives in your redis.conf:
```
# Save if at least 1 key changed in 900 seconds
save 900 1
# Save if at least 10 keys changed in 300 seconds
save 300 10
# Save if at least 10000 keys changed in 60 seconds
save 60 10000
```
These rules are evaluated continuously. If any condition is met, Redis triggers a background save. You can also trigger snapshots manually:
```shell
# Blocking save (don't use in production)
redis-cli SAVE
# Background save (recommended)
redis-cli BGSAVE
```
Here’s a Python script for automated RDB backups with rotation:
```python
import redis
import time
import shutil
from datetime import datetime
from pathlib import Path

def backup_rdb(redis_client, backup_dir, retention_days=7):
    """Create RDB backup with timestamp and cleanup old backups"""
    backup_path = Path(backup_dir)
    backup_path.mkdir(exist_ok=True)

    # Trigger background save (raises ResponseError if one is already running)
    redis_client.bgsave()

    # Wait for save to complete
    while redis_client.info('persistence')['rdb_bgsave_in_progress']:
        time.sleep(1)

    # Get RDB file location
    config = redis_client.config_get('dir')
    rdb_dir = Path(config['dir'])
    rdb_file = rdb_dir / 'dump.rdb'

    # Copy with timestamp
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    backup_file = backup_path / f'dump_{timestamp}.rdb'
    shutil.copy2(rdb_file, backup_file)

    # Cleanup old backups
    cutoff = time.time() - (retention_days * 86400)
    for old_backup in backup_path.glob('dump_*.rdb'):
        if old_backup.stat().st_mtime < cutoff:
            old_backup.unlink()

    return backup_file

# Usage
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
backup_file = backup_rdb(r, '/var/backups/redis')
print(f"Backup created: {backup_file}")
```
RDB Advantages:
- Compact single-file format, perfect for backups
- Faster restarts compared to AOF (binary format loads quickly)
- Minimal performance impact during normal operation
- Good for disaster recovery and replication
RDB Disadvantages:
- Data loss potential between snapshots (if Redis crashes, you lose changes since last save)
- Fork operation can cause latency spikes on large datasets
- Not suitable when you can’t afford to lose any data
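Fork latency is observable: Redis reports the duration of the most recent fork in the latest_fork_usec field of the INFO stats section. A minimal sketch of a check built on that field (the function name and the 100 ms threshold are illustrative choices, not Redis conventions):

```python
def check_fork_latency(stats, threshold_ms=100):
    """Flag slow forks using latest_fork_usec from INFO's stats section.

    `stats` is the dict returned by redis_client.info('stats'); the
    threshold is an illustrative default, tune it to your latency budget.
    """
    fork_ms = stats.get('latest_fork_usec', 0) / 1000
    if fork_ms > threshold_ms:
        return f"Last fork took {fork_ms:.1f} ms (threshold {threshold_ms} ms)"
    return None
```

Against a live server you would call `check_fork_latency(r.info('stats'))` after a BGSAVE and alert on a non-None result.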
AOF (Append-Only File)
AOF takes a different approach: it logs every write operation received by the server to an append-only file. When Redis restarts, it replays these operations to reconstruct the dataset. This provides much better durability than RDB snapshots.
The critical configuration is the fsync policy, which controls when data is actually written to disk:
```
# Enable AOF
appendonly yes
appendfilename "appendonly.aof"

# Fsync policy (choose one)
appendfsync always    # Fsync after every write (slowest, safest)
appendfsync everysec  # Fsync every second (good balance)
appendfsync no        # Let OS decide (fastest, least safe)
```
The everysec policy is the recommended default—it provides good durability (at most 1 second of data loss) with minimal performance impact.
AOF files grow continuously, so Redis includes an automatic rewrite mechanism that rebuilds the AOF from the current dataset, eliminating redundant operations:
```
# Trigger rewrite when AOF is 100% larger than after last rewrite
auto-aof-rewrite-percentage 100
# Don't rewrite if AOF is smaller than 64MB
auto-aof-rewrite-min-size 64mb
```
You can also trigger rewrites manually:
```shell
redis-cli BGREWRITEAOF
```
Here’s what an AOF file looks like:
```
*2
$6
SELECT
$1
0
*3
$3
SET
$5
mykey
$7
myvalue
*2
$4
INCR
$7
counter
```
This is the Redis protocol (RESP) format—human-readable but verbose. Most commands are preserved exactly as received (note that INCR takes only a key, so it is logged as a two-element array); a few, such as expire-related commands, are rewritten into deterministic forms like PEXPIREAT so that replaying them later gives the same result.
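To make the framing concrete, here is a minimal sketch of a parser for this format. It handles only the `*` array and `$` bulk-string markers shown in the excerpt, assumes text payloads (real AOF arguments can contain raw bytes, including newlines), and `parse_aof_commands` is an illustrative name, not a Redis API:

```python
def parse_aof_commands(aof_text):
    """Parse RESP command arrays into lists of arguments.

    Simplified sketch: '*N' announces an N-element command, and each
    element is a '$len' header followed by the argument on the next line.
    """
    lines = aof_text.splitlines()
    commands, i = [], 0
    while i < len(lines):
        if not lines[i].startswith('*'):
            i += 1
            continue
        argc = int(lines[i][1:])  # number of elements in this command
        args = []
        i += 1
        for _ in range(argc):
            # skip the '$len' header, take the argument line after it
            args.append(lines[i + 1])
            i += 2
        commands.append(args)
    return commands
```

For example, `parse_aof_commands("*2\r\n$4\r\nINCR\r\n$7\r\ncounter\r\n")` returns `[['INCR', 'counter']]`.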
RDB vs AOF: Comparison and Use Cases
| Aspect | RDB | AOF |
|---|---|---|
| Durability | Lose data between snapshots | Lose at most 1 sec (everysec) |
| File Size | Compact binary format | Larger, grows continuously |
| Recovery Time | Fast (binary load) | Slower (replay operations) |
| Performance Impact | Periodic fork latency | Continuous write overhead |
| Use Case | Backups, can tolerate data loss | Critical data, max durability |
Here’s a benchmark comparing write performance:
```python
import redis
import time

def benchmark_writes(redis_client, num_operations=100000):
    """Benchmark write performance"""
    start = time.time()
    for i in range(num_operations):
        redis_client.set(f'key:{i}', f'value:{i}')
    duration = time.time() - start
    ops_per_sec = num_operations / duration
    return {
        'duration': duration,
        'ops_per_sec': ops_per_sec
    }

# Test different configurations
configs = {
    'RDB only': {'appendonly': 'no'},
    'AOF everysec': {'appendonly': 'yes', 'appendfsync': 'everysec'},
    'AOF always': {'appendonly': 'yes', 'appendfsync': 'always'}
}

r = redis.Redis(host='localhost', port=6379, decode_responses=True)
for name, config in configs.items():
    # Apply config
    for key, value in config.items():
        r.config_set(key, value)
    # Flush and benchmark
    r.flushall()
    result = benchmark_writes(r, 10000)
    print(f"{name}:")
    print(f"  Operations/sec: {result['ops_per_sec']:.2f}")
    print(f"  Duration: {result['duration']:.2f}s\n")
```
When to use RDB:
- You can tolerate some data loss (5-15 minutes)
- You need fast restarts
- You want simple, compact backups
- Caching layer where data can be regenerated
When to use AOF:
- Data loss is unacceptable
- You need audit trail of operations
- Dataset changes frequently
- Transactional or financial data
Hybrid Persistence (RDB + AOF)
Since Redis 4.0, you can enable both mechanisms simultaneously. When both are on, Redis recovers from the AOF at restart, since it is guaranteed to be the most complete log; in hybrid mode that file begins with an RDB-format snapshot that loads quickly, followed by the append-only tail of changes made since the last rewrite. This gives you RDB’s fast recovery with AOF’s durability.
Configuration for hybrid mode:
```
# Enable both mechanisms
appendonly yes
save 900 1
save 300 10
save 60 10000

# Use RDB format for AOF rewrites (hybrid mode)
aof-use-rdb-preamble yes

# Standard AOF settings
appendfsync everysec
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
```
With aof-use-rdb-preamble yes, when Redis rewrites the AOF, it writes an RDB snapshot at the beginning followed by incremental AOF data. This dramatically reduces AOF file size and restart time.
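You can see the preamble directly: RDB payloads start with the 5-byte magic string REDIS followed by a four-digit version number, so a hybrid-format AOF begins the same way, while a plain AOF begins with a `*` command marker. A small sketch of that check (it applies to the single-file AOF layout discussed here; Redis 7's multi-part AOF stores the base in a separate file):

```python
def aof_has_rdb_preamble(aof_path):
    """Return True if the file starts with the RDB magic bytes 'REDIS',
    which indicates it was rewritten with aof-use-rdb-preamble enabled."""
    with open(aof_path, 'rb') as f:
        return f.read(5) == b'REDIS'
```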
Recovery process demonstration:
```python
import redis
import time

def demonstrate_hybrid_recovery():
    """Show hybrid persistence recovery"""
    r = redis.Redis(host='localhost', port=6379, decode_responses=True)

    # Write test data
    print("Writing test data...")
    for i in range(1000):
        r.set(f'key:{i}', f'value:{i}')

    # Force save and AOF rewrite
    print("Creating RDB snapshot...")
    r.bgsave()
    time.sleep(2)  # crude wait; poll rdb_bgsave_in_progress in real code

    print("Rewriting AOF...")
    r.bgrewriteaof()
    time.sleep(2)  # crude wait; poll aof_rewrite_in_progress in real code

    # Write more data (only in AOF)
    print("Writing additional data...")
    for i in range(1000, 1100):
        r.set(f'key:{i}', f'value:{i}')

    # Check persistence info
    info = r.info('persistence')
    print("\nPersistence status:")
    print(f"  RDB last save: {info['rdb_last_save_time']}")
    print(f"  AOF size: {info['aof_current_size']} bytes")
    print(f"  AOF base size: {info['aof_base_size']} bytes")
    return info

demonstrate_hybrid_recovery()
```
Best Practices and Production Recommendations
For production deployments, follow these guidelines:
- Enable hybrid persistence with appendfsync everysec for the best balance of durability and performance
- Monitor persistence health continuously
- Implement off-server backups of RDB files
- Size your server to handle fork operations (need 2x memory headroom)
- Use separate disks for AOF writes when possible
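The 2x headroom guideline can be turned into a quick check using the used_memory and total_system_memory fields from the INFO memory section. The factor is a rule of thumb, not a Redis-enforced limit: copy-on-write means a fork usually needs far less than a full second copy, but write-heavy workloads during a BGSAVE can approach one. A sketch:

```python
def fork_headroom_ok(used_memory, total_system_memory, factor=2.0):
    """Return True if the host could absorb a worst-case fork,
    i.e. used_memory * factor fits within total system memory.
    `factor=2.0` is the rule-of-thumb multiplier from the text."""
    return used_memory * factor <= total_system_memory
```

Against a live server: `mem = r.info('memory')`, then `fork_headroom_ok(mem['used_memory'], mem['total_system_memory'])` (total_system_memory may be absent on some platforms).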
Here’s a monitoring script:
```python
import redis
from datetime import datetime

def check_persistence_health(redis_client):
    """Monitor Redis persistence status"""
    info = redis_client.info('persistence')
    issues = []

    # Check RDB status
    if info['rdb_last_bgsave_status'] != 'ok':
        issues.append(f"RDB save failed: {info['rdb_last_bgsave_status']}")

    # Check AOF status
    if info.get('aof_enabled') and info.get('aof_last_bgrewrite_status') != 'ok':
        issues.append(f"AOF rewrite failed: {info['aof_last_bgrewrite_status']}")

    # Check for ongoing operations
    if info['rdb_bgsave_in_progress']:
        issues.append("RDB save in progress")
    if info.get('aof_rewrite_in_progress'):
        issues.append("AOF rewrite in progress")

    # Check last save time
    last_save = datetime.fromtimestamp(info['rdb_last_save_time'])
    age_minutes = (datetime.now() - last_save).total_seconds() / 60
    if age_minutes > 60:
        issues.append(f"Last RDB save was {age_minutes:.0f} minutes ago")

    return {
        'healthy': len(issues) == 0,
        'issues': issues,
        'info': info
    }

# Usage in monitoring system
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
health = check_persistence_health(r)
if not health['healthy']:
    print("ALERT: Persistence issues detected:")
    for issue in health['issues']:
        print(f"  - {issue}")
else:
    print("Persistence healthy")
```
For critical production systems, configure hybrid persistence, implement automated backups to separate storage (S3, network storage), and monitor both RDB and AOF health metrics. Test your recovery procedures regularly—persistence configurations are worthless if you can’t actually restore from them when disaster strikes.