NoSQL Key-Value Stores: Redis and DynamoDB

Key Insights

  • Redis excels at sub-millisecond latency for caching and real-time features but requires careful memory management and persistence strategies, while DynamoDB provides durable, auto-scaling storage with single-digit millisecond latency at virtually unlimited scale.
  • The choice between Redis and DynamoDB isn’t binary—most production systems benefit from using Redis as a cache layer in front of DynamoDB or other persistent stores, combining Redis’s speed with DynamoDB’s durability and scalability.
  • Partition key design in DynamoDB and data structure selection in Redis are the critical factors that determine whether your application scales efficiently or hits performance walls at production load.

Introduction to Key-Value Stores

Key-value stores represent the simplest NoSQL data model: a distributed hash table where each unique key maps to a value. Unlike relational databases with rigid schemas and complex join operations, key-value stores trade query flexibility for predictable performance and horizontal scalability.

You should reach for key-value stores when you need fast lookups by primary key, can denormalize your data model, and don’t require complex queries across multiple attributes. Session management, user profiles, product catalogs, and real-time leaderboards are ideal use cases.

Redis and DynamoDB occupy different positions in the key-value ecosystem. Redis is an in-memory data structure server offering sub-millisecond latency, making it perfect for caching, session stores, and real-time analytics. DynamoDB is AWS’s fully-managed, persistent key-value and document database designed for applications requiring consistent single-digit millisecond performance at any scale.

Redis Fundamentals

Redis stores data entirely in memory, which explains its exceptional speed. While it supports persistence through snapshots (RDB) and append-only files (AOF), Redis is primarily optimized for scenarios where you can rebuild data from source systems if needed.

What sets Redis apart from simple key-value stores is its rich data structures. Beyond basic strings, Redis natively supports hashes, lists, sets, sorted sets, bitmaps, and streams. Choosing the right data structure dramatically impacts both performance and memory efficiency.

Here’s how to perform basic operations in Redis:

import redis

# Connect to Redis
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Basic string operations
r.set('user:1000:name', 'Alice Johnson')
r.set('user:1000:email', 'alice@example.com')
name = r.get('user:1000:name')  # Returns 'Alice Johnson'

# Set with expiration (TTL in seconds)
r.setex('session:abc123', 3600, 'user_data_here')

# Delete key
r.delete('user:1000:email')

For storing objects, Redis hashes are more memory-efficient than serializing JSON into strings:

# Store user object as hash
r.hset('user:1000', mapping={
    'name': 'Alice Johnson',
    'email': 'alice@example.com',
    'signup_date': '2024-01-15',
    'credits': '100'
})

# Get specific field
email = r.hget('user:1000', 'email')

# Get all fields
user = r.hgetall('user:1000')

# Increment numeric field atomically
r.hincrby('user:1000', 'credits', 50)

Sorted sets are perfect for leaderboards and ranking systems:

# Add players to leaderboard with scores
r.zadd('leaderboard:daily', {
    'player:alice': 2500,
    'player:bob': 1800,
    'player:charlie': 3200
})

# Get top 10 players (highest scores)
top_players = r.zrevrange('leaderboard:daily', 0, 9, withscores=True)

# Get player's rank (0-indexed)
rank = r.zrevrank('leaderboard:daily', 'player:alice')

# Get players in score range
mid_tier = r.zrangebyscore('leaderboard:daily', 2000, 3000)

DynamoDB Fundamentals

DynamoDB distributes data across multiple partitions based on the partition key. Every table requires a partition key (also called hash key), and optionally a sort key (range key) for composite primary keys. Understanding this design is crucial—poor partition key selection leads to hot partitions and throttling.

The partition key determines which partition stores your data, while the sort key enables range queries within a partition. For example, a messaging app might use user_id as partition key and timestamp as sort key to efficiently query a user’s recent messages.
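
A common mitigation for hot partitions is write sharding: appending a bounded random suffix to a hot partition key so writes spread across several physical partitions, with reads fanning out across the suffixes. A minimal sketch of the key scheme (the `#` separator and shard count are illustrative choices, not a DynamoDB API):

```python
import random

NUM_SHARDS = 10  # illustrative; tune to your write volume

def sharded_partition_key(base_key: str) -> str:
    """Spread writes for a hot key across NUM_SHARDS logical partitions."""
    return f'{base_key}#{random.randrange(NUM_SHARDS)}'

def all_shard_keys(base_key: str) -> list:
    """Reads must fan out across every shard and merge the results."""
    return [f'{base_key}#{i}' for i in range(NUM_SHARDS)]
```

Queries against a sharded key require one Query per shard (often issued in parallel) plus a client-side merge, so shard only the keys that actually run hot.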

Creating a DynamoDB table with Python’s boto3:

import boto3

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')

# Create table with composite primary key
table = dynamodb.create_table(
    TableName='UserMessages',
    KeySchema=[
        {'AttributeName': 'user_id', 'KeyType': 'HASH'},   # Partition key
        {'AttributeName': 'timestamp', 'KeyType': 'RANGE'}  # Sort key
    ],
    AttributeDefinitions=[
        {'AttributeName': 'user_id', 'AttributeType': 'S'},
        {'AttributeName': 'timestamp', 'AttributeType': 'N'}
    ],
    BillingMode='PAY_PER_REQUEST'  # On-demand pricing
)

Basic CRUD operations in DynamoDB:

table = dynamodb.Table('UserMessages')

# Put item
table.put_item(
    Item={
        'user_id': 'user_123',
        'timestamp': 1705334400,
        'message': 'Hello world',
        'sender': 'user_456',
        'read': False
    }
)

# Get specific item
response = table.get_item(
    Key={
        'user_id': 'user_123',
        'timestamp': 1705334400
    }
)
item = response.get('Item')

# Query all messages for user (sorted by timestamp)
response = table.query(
    KeyConditionExpression='user_id = :uid AND #ts > :ts',
    ExpressionAttributeNames={'#ts': 'timestamp'},  # 'timestamp' is a DynamoDB reserved word
    ExpressionAttributeValues={
        ':uid': 'user_123',
        ':ts': 1705248000
    }
)
messages = response['Items']

For bulk operations, use batch APIs to reduce network round trips:

# Batch write: batch_writer buffers items and flushes them in chunks of 25 automatically
with table.batch_writer() as batch:
    for i in range(100):
        batch.put_item(
            Item={
                'user_id': f'user_{i}',
                'timestamp': 1705334400 + i,
                'message': f'Message {i}'
            }
        )

# Batch get (up to 100 items)
response = dynamodb.batch_get_item(
    RequestItems={
        'UserMessages': {
            'Keys': [
                {'user_id': 'user_123', 'timestamp': 1705334400},
                {'user_id': 'user_123', 'timestamp': 1705334500}
            ]
        }
    }
)

Performance and Scalability Patterns

Redis delivers microsecond latency for simple operations because everything lives in memory. A single Redis instance can handle 100,000+ operations per second. For higher throughput, Redis Cluster shards data across multiple nodes using hash slots.
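
Redis Cluster maps every key to one of 16,384 hash slots using CRC16 (the XMODEM variant), and each node owns a range of slots. Keys that share a hash tag, the part between the first { and }, land in the same slot, which is what keeps multi-key commands possible in cluster mode. A sketch of the slot computation:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16/XMODEM (poly 0x1021, init 0), as used by Redis Cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of 16384 cluster hash slots, honoring {hash tags}."""
    start = key.find('{')
    if start != -1:
        end = key.find('}', start + 1)
        if end > start + 1:  # only a non-empty tag counts
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384
```

Because `{user:1000}.profile` and `{user:1000}.sessions` hash to the same slot, multi-key operations across them still work on a cluster.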

DynamoDB provides consistent single-digit millisecond latency with virtually unlimited scalability. It automatically partitions and repartitions data as your table grows. With on-demand billing, DynamoDB handles traffic spikes without capacity planning.

Redis pipelining batches multiple commands into a single network round trip:

# Without pipelining: 1000 network round trips
for i in range(1000):
    r.set(f'key:{i}', f'value:{i}')

# With pipelining: 1 network round trip
pipe = r.pipeline()
for i in range(1000):
    pipe.set(f'key:{i}', f'value:{i}')
pipe.execute()

DynamoDB Global Secondary Indexes (GSI) enable queries on non-key attributes:

# Create the table with a GSI for querying by sender
# (a GSI can also be added to an existing table via update_table)
table = dynamodb.create_table(
    TableName='UserMessages',
    KeySchema=[
        {'AttributeName': 'user_id', 'KeyType': 'HASH'},
        {'AttributeName': 'timestamp', 'KeyType': 'RANGE'}
    ],
    AttributeDefinitions=[
        {'AttributeName': 'user_id', 'AttributeType': 'S'},
        {'AttributeName': 'timestamp', 'AttributeType': 'N'},
        {'AttributeName': 'sender', 'AttributeType': 'S'}
    ],
    GlobalSecondaryIndexes=[{
        'IndexName': 'SenderIndex',
        'KeySchema': [
            {'AttributeName': 'sender', 'KeyType': 'HASH'},
            {'AttributeName': 'timestamp', 'KeyType': 'RANGE'}
        ],
        'Projection': {'ProjectionType': 'ALL'}
    }],
    BillingMode='PAY_PER_REQUEST'
)

# Query using GSI
response = table.query(
    IndexName='SenderIndex',
    KeyConditionExpression='sender = :s',
    ExpressionAttributeValues={':s': 'user_456'}
)

Caching Strategies and Use Cases

The cache-aside pattern with Redis is the most common architecture: your application checks Redis first, and on cache miss, fetches from the primary database and populates the cache.

import json

def get_user_profile(user_id):
    cache_key = f'user:profile:{user_id}'

    # Try cache first
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)

    # Cache miss - fetch from DynamoDB
    response = table.get_item(Key={'user_id': user_id})
    profile = response.get('Item')

    # Populate cache with 1-hour TTL
    # (boto3 returns numbers as Decimal, which json.dumps can't
    # serialize natively; default=str is a simple workaround)
    if profile:
        r.setex(cache_key, 3600, json.dumps(profile, default=str))

    return profile

Both systems support TTL, but with different semantics. Redis expiration is precise from the client's perspective: an expired key is never returned, even though physical removal happens lazily on access and via background sampling. DynamoDB TTL is eventual—items are typically deleted within 48 hours of expiry, which is useful for compliance-driven cleanup but not for real-time cache invalidation.

# Redis TTL (exact)
r.setex('session:xyz', 1800, 'session_data')

import time

# DynamoDB TTL (eventual, typically within 48 hours of expiry).
# TTL must first be enabled on the table via UpdateTimeToLive,
# naming 'ttl' as the TTL attribute.
table.put_item(
    Item={
        'user_id': 'user_123',
        'timestamp': 1705334400,
        'data': 'temporary_data',
        'ttl': int(time.time()) + 86400  # expire in ~24 hours
    }
)

Cost and Operational Considerations

Redis requires you to provision memory capacity. A 32GB Redis instance on AWS ElastiCache costs around $200-300/month. You’re paying for peak capacity even during low-traffic periods. Redis Cluster adds complexity but enables horizontal scaling.

DynamoDB’s on-demand pricing charges per request: $1.25 per million writes, $0.25 per million reads. For predictable workloads, provisioned capacity is cheaper. A table serving 100 reads/sec and 20 writes/sec costs roughly $50-70/month with provisioned capacity.
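
To sanity-check those numbers, the on-demand cost of a steady workload is simple arithmetic (using the per-request prices quoted above; a real AWS bill also includes storage and data transfer, and prices vary by region):

```python
READS_PER_SEC = 100
WRITES_PER_SEC = 20
SECONDS_PER_MONTH = 86400 * 30

reads_per_month = READS_PER_SEC * SECONDS_PER_MONTH    # 259,200,000
writes_per_month = WRITES_PER_SEC * SECONDS_PER_MONTH  # 51,840,000

read_cost = reads_per_month / 1_000_000 * 0.25    # $0.25 per million reads
write_cost = writes_per_month / 1_000_000 * 1.25  # $1.25 per million writes

total = read_cost + write_cost
print(f'On-demand: ${total:.2f}/month')
```

At roughly $130/month on-demand versus $50-70/month provisioned, the same steady workload costs about twice as much on-demand, which is why provisioned capacity wins when traffic is predictable.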

Operational overhead differs dramatically. Managed Redis (ElastiCache, Redis Cloud) handles failover and backups but requires capacity planning and occasional scaling operations. DynamoDB is truly serverless—no servers to patch, no capacity planning, automatic backups.

For persistence, Redis offers RDB snapshots (point-in-time) or AOF (append-only file). DynamoDB provides continuous backups and point-in-time recovery up to 35 days. DynamoDB’s durability guarantees are stronger.

Choosing the Right Store

Use Redis when you need sub-millisecond latency, can tolerate data loss, and have predictable memory requirements. Ideal for: session stores, real-time analytics, pub/sub messaging, rate limiting, and caching.
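
As an example of the rate-limiting use case, a fixed-window limiter needs only INCR and EXPIRE. A sketch, assuming a redis-py-compatible client `r` (the key scheme, defaults, and `now` override for testability are illustrative choices):

```python
import time

def is_allowed(r, client_id, limit=100, window=60, now=None):
    """Fixed-window limiter: allow at most `limit` requests per `window` seconds."""
    now = time.time() if now is None else now
    # One counter key per client per time window
    key = f'ratelimit:{client_id}:{int(now) // window}'
    count = r.incr(key)        # atomic; creates the key at 1 if missing
    if count == 1:
        r.expire(key, window)  # first request in the window sets the TTL
    return count <= limit
```

INCR is atomic, so concurrent requests can't undercount, and stale window counters simply expire on their own.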

Use DynamoDB when you need durable storage, unpredictable scaling, and single-digit millisecond latency is sufficient. Ideal for: user profiles, product catalogs, IoT data, mobile app backends, and gaming session state.

Most production architectures use both: Redis for hot data and caching, DynamoDB for durable storage. This hybrid approach combines Redis’s speed with DynamoDB’s scalability and durability. Start with DynamoDB as your source of truth, add Redis caching where latency requirements demand it.
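
In that hybrid setup, writes go to DynamoDB first and then invalidate the cached copy, so the next read repopulates Redis from the source of truth. A sketch of the write path (function and key names are illustrative, mirroring the cache-aside reader above):

```python
def save_user_profile(table, cache, user_id, profile):
    """Write-then-invalidate: DynamoDB stays the source of truth."""
    # 1. Persist to the durable store first
    table.put_item(Item={'user_id': user_id, **profile})
    # 2. Drop the cached copy; the next cache-aside read re-populates it
    cache.delete(f'user:profile:{user_id}')
```

Deleting rather than updating the cache entry avoids a race where two concurrent writers leave a stale value in Redis after both have finished.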

The key question isn’t “Redis or DynamoDB?” but rather “What data needs microsecond access, and what can tolerate milliseconds?” Design your data model around access patterns, choose partition keys that distribute load evenly, and use Redis to accelerate the critical path while DynamoDB handles the long tail.
