Design a Leaderboard System: Real-Time Rankings
Key Insights
- Redis sorted sets provide O(log N) ranking operations, making them the backbone of virtually every production leaderboard system—but you need to understand their memory characteristics and sharding limitations.
- Separating your write path from your read path isn’t optional at scale; score submissions should flow through a queue while leaderboard queries hit cached snapshots.
- Real-time doesn’t mean instant for every user—smart batching and tiered update strategies let you deliver perceived real-time performance without melting your infrastructure.
The Leaderboard Challenge
Leaderboards look deceptively simple. Store some scores, sort them, show the top N. A junior developer could build one in an afternoon. But that afternoon project collapses the moment you need to answer “what’s my rank?” for millions of users simultaneously while thousands of scores update every second.
The core challenges are threefold: ranking consistency (everyone sees the same relative ordering), update latency (score changes reflect quickly), and query performance (fetching ranks can’t become a bottleneck). Gaming platforms, fitness apps like Strava, sales dashboards, and competitive coding sites all face these same constraints.
Let’s design a system that handles these challenges properly.
Core Data Structures for Rankings
Sorted sets are the foundation of leaderboard design. Unlike regular sets, sorted sets maintain elements ordered by a score value, giving you O(log N) insertions and O(log N) rank lookups. Redis implements these internally as a skip list paired with a hash map (the hash map gives O(1) score lookups by member), which provides excellent performance characteristics for leaderboard operations.
Here’s why sorted sets win over alternatives:
- B-trees/databases: Require expensive ORDER BY queries for ranking
- Heaps: Great for top-K but terrible for arbitrary rank lookups
- Arrays: O(N) insertions kill you at scale
The basic Redis operations you’ll live with:
import redis

class LeaderboardService:
    def __init__(self, redis_client: redis.Redis):
        # Assumes a client created with decode_responses=True so members
        # come back as str rather than bytes
        self.redis = redis_client
        self.leaderboard_key = "leaderboard:global"

    def submit_score(self, user_id: str, score: float) -> None:
        # ZADD with GT flag: only update if the new score is greater
        self.redis.zadd(self.leaderboard_key, {user_id: score}, gt=True)

    def get_rank(self, user_id: str) -> int | None:
        # ZREVRANK returns a 0-indexed rank (highest score = rank 0)
        rank = self.redis.zrevrank(self.leaderboard_key, user_id)
        return rank + 1 if rank is not None else None

    def get_top_n(self, n: int = 100) -> list[tuple[str, float]]:
        # ZREVRANGE with scores returns [(user_id, score), ...]
        return self.redis.zrevrange(
            self.leaderboard_key, 0, n - 1, withscores=True
        )

    def get_score(self, user_id: str) -> float | None:
        return self.redis.zscore(self.leaderboard_key, user_id)
A critical design decision: dense vs. sparse rankings. Dense ranking means if three players tie for first, the next player is rank 2. Sparse (or competition) ranking would make them rank 4. Redis itself gives you neither: members with equal scores receive distinct consecutive ranks, ordered lexicographically by member. If you need dense or competition rankings, you'll need to compute them application-side or maintain a separate mapping.
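If dense rankings matter for your product, the application-side computation is straightforward once you have a score-descending page of results (e.g. the output of `get_top_n`). A minimal sketch:

```python
def dense_ranks(entries: list[tuple[str, float]]) -> list[tuple[str, float, int]]:
    """Assign dense ranks to a score-descending list of (user_id, score).

    Tied scores share a rank, and the next distinct score gets the
    next consecutive rank (1, 1, 1, 2 rather than 1, 1, 1, 4).
    """
    ranked = []
    rank = 0
    prev_score = None
    for user_id, score in entries:
        if score != prev_score:
            rank += 1
            prev_score = score
        ranked.append((user_id, score, rank))
    return ranked

# Three-way tie for first: the next distinct score is rank 2, not rank 4
print(dense_ranks([("a", 90), ("b", 90), ("c", 90), ("d", 85)]))
# → [('a', 90, 1), ('b', 90, 1), ('c', 90, 1), ('d', 85, 2)]
```

For large boards you'd cache this alongside the leaderboard snapshot rather than recompute it per request.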
System Architecture Overview
At production scale, you can’t just point your API servers directly at Redis. Here’s the component breakdown:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Clients │────▶│ API Layer │────▶│ Score Queue │
└─────────────┘ └─────────────┘ └─────────────┘
│ │ │
│ ▼ ▼
│ ┌─────────────┐ ┌─────────────┐
│ │ Read Cache │ │Score Workers│
│ │ (Redis) │ └─────────────┘
│ └─────────────┘ │
│ ▲ ▼
│ │ ┌─────────────┐
└───────────────────┴────────────│Primary Redis│
WebSocket Updates │(Sorted Sets)│
└─────────────┘
│
▼
┌─────────────┐
│ PostgreSQL │
│ (Archival) │
└─────────────┘
The write path and read path separation is crucial. Score submissions go through a queue (Kafka, SQS, or Redis Streams) to handle burst traffic. Workers process these asynchronously, updating the primary sorted set. Read requests hit a cache layer that’s refreshed on a schedule or via pub/sub notifications.
from fastapi import FastAPI, WebSocket
from pydantic import BaseModel
import asyncio
import time

app = FastAPI()
# score_queue, read_cache, and leaderboard_service are initialized elsewhere

class ScoreSubmission(BaseModel):
    user_id: str
    score: float
    game_id: str

@app.post("/scores")
async def submit_score(submission: ScoreSubmission):
    # Don't update Redis directly—queue it
    await score_queue.enqueue({
        "user_id": submission.user_id,
        "score": submission.score,
        "game_id": submission.game_id,
        "timestamp": time.time()
    })
    return {"status": "accepted"}

@app.get("/leaderboard")
async def get_leaderboard(limit: int = 100, offset: int = 0):
    # Hit the read cache, not primary storage
    cached = await read_cache.get(f"leaderboard:top:{limit}:{offset}")
    if cached:
        return cached
    # Cache miss: fetch and populate
    results = await leaderboard_service.get_range(offset, offset + limit)
    await read_cache.set(
        f"leaderboard:top:{limit}:{offset}",
        results,
        ttl=5  # 5-second cache
    )
    return results
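The worker side of the write path can be sketched with Redis Streams (the stream, group, and key names below are illustrative, and `r` is assumed to be a `redis.Redis` client created with `decode_responses=True`):

```python
def coalesce_batch(batch: list[dict]) -> dict[str, float]:
    """Collapse a batch of queued submissions into a single ZADD mapping,
    keeping each user's highest score (mirrors ZADD's GT semantics)."""
    mapping: dict[str, float] = {}
    for msg in batch:
        uid, score = msg["user_id"], float(msg["score"])
        if uid not in mapping or score > mapping[uid]:
            mapping[uid] = score
    return mapping

def run_worker(r, stream: str = "scores:stream", group: str = "score-workers",
               consumer: str = "worker-1",
               leaderboard_key: str = "leaderboard:global") -> None:
    """Consume score submissions from a Redis Stream and apply them in batches."""
    try:
        r.xgroup_create(stream, group, id="0", mkstream=True)
    except Exception:
        pass  # consumer group already exists
    while True:
        # Block for up to 1s waiting for up to 100 new messages
        entries = r.xreadgroup(group, consumer, {stream: ">"}, count=100, block=1000)
        for _stream_name, messages in entries:
            mapping = coalesce_batch([fields for _msg_id, fields in messages])
            if mapping:
                # One ZADD per batch instead of one per submission
                r.zadd(leaderboard_key, mapping, gt=True)
            r.xack(stream, group, *[msg_id for msg_id, _fields in messages])
```

Batching here matters as much as the queue itself: coalescing a burst of submissions into one ZADD is what lets workers absorb spikes without hammering the primary sorted set.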
Handling Real-Time Updates
“Real-time” is a spectrum. For a casual mobile game, updating ranks every 5 seconds is fine. For a live esports event, you need sub-second updates. Your architecture should support both without rebuilding.
WebSockets are the standard approach for pushing updates to clients:
from fastapi import WebSocket, WebSocketDisconnect
from typing import Set
import json

class LeaderboardBroadcaster:
    def __init__(self):
        self.connections: Set[WebSocket] = set()
        self.update_buffer: list = []
        self.buffer_interval = 0.1  # 100ms batching

    async def connect(self, websocket: WebSocket):
        await websocket.accept()
        self.connections.add(websocket)
        # Send the current top 10 immediately on connect
        top_10 = await leaderboard_service.get_top_n(10)
        await websocket.send_json({"type": "snapshot", "data": top_10})

    def disconnect(self, websocket: WebSocket):
        self.connections.discard(websocket)

    def queue_update(self, user_id: str, new_rank: int, score: float):
        self.update_buffer.append({
            "user_id": user_id,
            "rank": new_rank,
            "score": score
        })

    async def flush_updates(self):
        """Called periodically to batch-send buffered updates"""
        if not self.update_buffer:
            return
        # Deduplicate: keep only the latest update per user
        latest_updates = {}
        for update in self.update_buffer:
            latest_updates[update["user_id"]] = update
        message = json.dumps({
            "type": "rank_updates",
            "data": list(latest_updates.values())
        })
        # Broadcast to all connected clients, dropping dead connections
        dead_connections = set()
        for ws in self.connections:
            try:
                await ws.send_text(message)
            except Exception:
                dead_connections.add(ws)
        self.connections -= dead_connections
        self.update_buffer.clear()

# Background task to flush updates on a fixed cadence
async def update_flusher(broadcaster: LeaderboardBroadcaster):
    while True:
        await asyncio.sleep(broadcaster.buffer_interval)
        await broadcaster.flush_updates()
The batching strategy here is critical. Without it, a viral moment where thousands of scores update simultaneously would create a WebSocket message storm. The 100ms buffer collapses those into digestible chunks.
Scaling Strategies
A single Redis instance handles millions of sorted set entries comfortably. But when you hit tens of millions of users or need geographic distribution, you need sharding strategies.
Time-window sharding is the most common approach: separate sorted sets for daily, weekly, monthly, and all-time leaderboards. This naturally limits set sizes and makes archival straightforward.
Score-range sharding splits users into buckets (0-1000 points, 1001-5000, etc.). This works well when score distributions are known and stable.
Geographic sharding maintains regional leaderboards with periodic global aggregation.
Here’s a partitioned leaderboard implementation:
import hashlib
import uuid

class ShardedLeaderboard:
    def __init__(self, redis_client: redis.Redis, num_shards: int = 16):
        self.redis = redis_client
        self.num_shards = num_shards

    def _get_shard(self, user_id: str) -> int:
        hash_val = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
        return hash_val % self.num_shards

    def _shard_key(self, shard: int, time_window: str) -> str:
        return f"leaderboard:{time_window}:shard:{shard}"

    def submit_score(self, user_id: str, score: float, time_window: str = "daily"):
        shard = self._get_shard(user_id)
        key = self._shard_key(shard, time_window)
        self.redis.zadd(key, {user_id: score}, gt=True)

    def get_global_top_n(self, n: int, time_window: str = "daily") -> list:
        # Aggregate top scores from all shards with ZUNIONSTORE
        temp_key = f"temp:aggregated:{time_window}:{uuid.uuid4()}"
        shard_keys = [
            self._shard_key(i, time_window)
            for i in range(self.num_shards)
        ]
        self.redis.zunionstore(temp_key, shard_keys, aggregate="MAX")
        results = self.redis.zrevrange(temp_key, 0, n - 1, withscores=True)
        self.redis.delete(temp_key)
        return results

    def get_approximate_rank(self, user_id: str, time_window: str) -> int | None:
        """Returns an approximate global rank by extrapolating from one shard"""
        shard = self._get_shard(user_id)
        key = self._shard_key(shard, time_window)
        local_rank = self.redis.zrevrank(key, user_id)
        if local_rank is None:
            return None
        # Approximate: assume scores are evenly distributed across shards
        return local_rank * self.num_shards
For tie-breaking, include a timestamp component in your score. Redis sorted sets use lexicographic ordering for equal scores, but encoding timestamp into the score gives you deterministic ordering:
def create_composite_score(score: int, timestamp: float) -> float:
    # Score in the integer part, inverted timestamp in the decimal part
    # Higher scores win; earlier timestamps win ties
    max_timestamp = 2_000_000_000  # ~year 2033
    return score + (max_timestamp - timestamp) / max_timestamp
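Going the other way—recovering the raw score and timestamp from a stored composite—is worth having for display and debugging. A sketch that restates the encoder for a round trip (note the float64 precision caveat: at very large scores the fractional tie-break loses sub-second detail):

```python
def create_composite_score(score: int, timestamp: float) -> float:
    """Encode score in the integer part, inverted timestamp in the fraction."""
    max_timestamp = 2_000_000_000  # must match the decoder
    return score + (max_timestamp - timestamp) / max_timestamp

def split_composite_score(composite: float) -> tuple[int, float]:
    """Recover (score, approximate timestamp) from a composite score."""
    max_timestamp = 2_000_000_000  # must match the encoder
    score = int(composite)
    fraction = composite - score
    timestamp = max_timestamp - fraction * max_timestamp
    return score, timestamp
```

The recovered timestamp is approximate (good to well under a second for typical score magnitudes), so treat it as a tie-break artifact, not a source of truth—keep the real submission time elsewhere if you need it exactly.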
Advanced Features
Relative rankings (“show me players near my rank”) are essential for engagement:
def get_neighbors(self, user_id: str, above: int = 5, below: int = 5) -> list:
    rank = self.redis.zrevrank(self.leaderboard_key, user_id)
    if rank is None:
        return []
    start = max(0, rank - above)
    end = rank + below
    return self.redis.zrevrange(
        self.leaderboard_key, start, end, withscores=True
    )
Sliding window leaderboards reset automatically using Redis key expiration:
def submit_to_sliding_window(self, user_id: str, score: float, window_hours: int = 24):
    # Scores land in the current hour's bucket; each bucket expires on
    # its own once it ages out of the window, so no cleanup job is needed
    current_hour = int(time.time() // 3600)
    key = f"leaderboard:hourly:{current_hour}"
    self.redis.zadd(key, {user_id: score}, gt=True)
    self.redis.expire(key, window_hours * 3600)
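The read side of this design unions the live hourly buckets. A sketch, assuming the same `leaderboard:hourly:{hour}` key format and a `redis.Redis` client `r`:

```python
import time
import uuid

def hourly_bucket_keys(current_hour: int, window_hours: int) -> list[str]:
    """Keys for the current hour's bucket and the window_hours - 1 before it."""
    return [f"leaderboard:hourly:{current_hour - i}" for i in range(window_hours)]

def get_sliding_window_top_n(r, n: int = 100, window_hours: int = 24) -> list:
    """Top N over the sliding window by unioning the hourly buckets.

    MAX aggregation keeps each user's best score across buckets,
    matching the GT semantics of the write path.
    """
    keys = hourly_bucket_keys(int(time.time() // 3600), window_hours)
    temp_key = f"temp:sliding:{uuid.uuid4()}"
    r.zunionstore(temp_key, keys, aggregate="MAX")
    results = r.zrevrange(temp_key, 0, n - 1, withscores=True)
    r.delete(temp_key)
    return results
```

The union is O(N log N) in the total bucket size, so in practice you'd run it on a schedule and cache the result rather than per request.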
Operational Considerations
Monitor these metrics religiously:
- p99 latency for ZRANK and ZREVRANGE operations
- Sorted set cardinality (ZCARD) to catch unexpected growth
- Memory usage per leaderboard key
- WebSocket connection counts and message throughput
For fraud detection, hook into your score submission pipeline. Flag statistical anomalies: score jumps that exceed reasonable thresholds, submission rates that suggest automation, or scores that cluster suspiciously. Don’t block submissions synchronously—flag and review asynchronously.
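A first-pass anomaly check can be a pure function over a user's recent score history; the thresholds below are placeholders you would tune per game from real data:

```python
def flag_anomalies(history: list[float], new_score: float,
                   max_jump: float = 5000.0, z_threshold: float = 4.0) -> list[str]:
    """Return reasons a submission looks suspicious; an empty list means clean.

    Two cheap heuristics: an absolute jump over the user's previous best,
    and a z-score test once there's enough history to estimate a baseline.
    """
    flags = []
    if history and new_score - max(history) > max_jump:
        flags.append("score_jump")
    if len(history) >= 10:
        mean = sum(history) / len(history)
        variance = sum((s - mean) ** 2 for s in history) / len(history)
        std = variance ** 0.5
        if std > 0 and (new_score - mean) / std > z_threshold:
            flags.append("statistical_outlier")
    return flags
```

Because this runs off the queue, flagged submissions still land on the board immediately; a review job removes them later if the flags hold up.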
Archive historical leaderboards to PostgreSQL or a data warehouse on a schedule. Keep the hot path in Redis, but maintain queryable history for analytics and dispute resolution.
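A periodic archival job can page through a finished board in rank order and bulk-insert it before deleting the key. A sketch—the table name, schema, and client types here are illustrative:

```python
def archive_rows(board_key: str, start_rank: int,
                 entries: list[tuple[str, float]]) -> list[tuple]:
    """Turn a page of ZREVRANGE results into archive rows with 1-based ranks."""
    return [
        (board_key, user_id, score, start_rank + i + 1)
        for i, (user_id, score) in enumerate(entries)
    ]

def archive_leaderboard(r, pg_conn, board_key: str, batch_size: int = 1000) -> None:
    """Copy a finished leaderboard from Redis to PostgreSQL, then delete it.

    Assumes a table like:
      CREATE TABLE leaderboard_archive (
          board_key TEXT, user_id TEXT, score DOUBLE PRECISION, rank BIGINT)
    `r` is a redis.Redis client; `pg_conn` is a psycopg connection.
    """
    offset = 0
    with pg_conn.cursor() as cur:
        while True:
            page = r.zrevrange(board_key, offset, offset + batch_size - 1,
                               withscores=True)
            if not page:
                break
            cur.executemany(
                "INSERT INTO leaderboard_archive VALUES (%s, %s, %s, %s)",
                archive_rows(board_key, offset, page),
            )
            offset += len(page)
    pg_conn.commit()
    r.delete(board_key)
```

Archiving only closed time windows (yesterday's daily board, last week's weekly board) sidesteps consistency issues—nothing is writing to those keys anymore.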
Leaderboards seem simple until they’re not. Build for the scale you expect in 18 months, not today. The patterns here will get you there.