Design a URL Shortener: System Design Interview
Key Insights
- URL shorteners are read-heavy systems (100:1 ratio) where caching strategy matters more than database optimization—a well-designed cache layer handles 90%+ of traffic
- Base62 encoding with a distributed counter (not hashing) provides guaranteed uniqueness without collision handling complexity, and 7 characters support 3.5 trillion URLs
- Choose 301 (permanent) vs 302 (temporary) redirects deliberately—this single decision affects caching behavior, analytics accuracy, and SEO implications
Problem Statement & Requirements
Before diving into architecture, nail down the requirements. Interviewers want to see you ask clarifying questions, not assume.
Functional requirements:
- Shorten a long URL to a short, unique code
- Redirect short URLs to original destinations
- Optional: Track click analytics, custom aliases, expiration
Non-functional requirements:
- Low latency redirects (< 100ms p99)
- High availability (99.9%+ uptime)
- Eventually consistent is acceptable for analytics
Back-of-envelope calculations:
Assume 100 million new URLs per month, with a 100:1 read-to-write ratio:
- Writes: ~40 URLs/second
- Reads: ~4,000 redirects/second
- Storage (5 years): 100M × 12 × 5 = 6 billion URLs
- At ~500 bytes per record: 3 TB total storage
These numbers tell us: this is a read-heavy system where caching is critical, and storage is manageable with proper partitioning.
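The arithmetic is quick to sanity-check (a throwaway calculation, not part of the system itself):

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # ~2.6 million

new_urls_per_month = 100_000_000
read_write_ratio = 100

writes_per_sec = new_urls_per_month / SECONDS_PER_MONTH  # ~39, i.e. "~40 URLs/second"
reads_per_sec = writes_per_sec * read_write_ratio        # ~3,900 redirects/second

total_urls = new_urls_per_month * 12 * 5                 # 6 billion over 5 years
storage_tb = total_urls * 500 / 1e12                     # 3 TB at ~500 bytes/record
```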
High-Level Architecture
The system breaks into distinct components with clear responsibilities:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Client │────▶│ API Gateway │────▶│ Load │
└─────────────┘ │ (Rate Limit)│ │ Balancer │
└─────────────┘ └──────┬──────┘
│
┌──────────────────────────┼──────────────────────────┐
│ │ │
┌─────▼─────┐ ┌───────▼───────┐ ┌───────▼───────┐
│ Shortening│ │ Redirect │ │ Analytics │
│ Service │ │ Service │ │ Service │
└─────┬─────┘ └───────┬───────┘ └───────────────┘
│ │
│ ┌─────▼─────┐
│ │ Cache │
│ │ (Redis) │
│ └─────┬─────┘
│ │
┌─────▼──────────────────────────▼─────┐
│ Database │
│ (Partitioned) │
└──────────────────────────────────────┘
The redirect service sits on the hot path and must be optimized aggressively. The shortening service handles writes and can tolerate slightly higher latency.
URL Encoding Strategy
You need to convert a numeric ID into a short, URL-safe string. Base62 (a-z, A-Z, 0-9) is the standard choice—no special characters that need URL encoding.
Why 7 characters? 62^7 ≈ 3.5 trillion possible combinations. That’s enough for decades of growth.
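You can verify the capacity figure directly:

```python
capacity = 62 ** 7
print(f"{capacity:,}")  # 3,521,614,606,208 (about 3.5 trillion)
```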
Counter-based vs. Hash-based:
Hash-based approaches (MD5/SHA256 truncated) seem simpler but create collision nightmares. You’d need collision detection and retry logic, adding latency and complexity.
Counter-based approaches guarantee uniqueness. Use a distributed ID generator like Twitter’s Snowflake or a dedicated sequence service.
import string
ALPHABET = string.ascii_letters + string.digits # 62 characters
BASE = len(ALPHABET)
def encode_base62(num: int) -> str:
"""Convert integer to base62 string."""
if num == 0:
return ALPHABET[0]
chars = []
while num > 0:
chars.append(ALPHABET[num % BASE])
num //= BASE
return ''.join(reversed(chars))
def decode_base62(short_code: str) -> int:
"""Convert base62 string back to integer."""
num = 0
for char in short_code:
num = num * BASE + ALPHABET.index(char)
return num
# Example usage
url_id = 123456789
short_code = encode_base62(url_id)  # Returns "iwaUH"
original_id = decode_base62(short_code) # Returns 123456789
For distributed ID generation, pre-allocate ID ranges to each service instance. Instance A gets IDs 1-1,000,000, Instance B gets 1,000,001-2,000,000, and so on. This eliminates coordination overhead during writes.
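A sketch of a range-allocating generator. `allocate_range` is an illustrative stand-in for whatever atomically reserves the next block (a single-row counter table, a sequence service, etc.):

```python
import threading

class RangeIDGenerator:
    """Serve IDs from a locally held block; fetch a new block when exhausted."""

    def __init__(self, allocate_range, block_size: int = 1_000_000):
        self._allocate = allocate_range  # returns the first ID of a fresh block
        self._block_size = block_size
        self._lock = threading.Lock()
        self._next = 0
        self._end = 0  # empty range forces an allocation on first use

    def next_id(self) -> int:
        with self._lock:
            if self._next >= self._end:
                start = self._allocate(self._block_size)
                self._next, self._end = start, start + self._block_size
            new_id = self._next
            self._next += 1
            return new_id
```

Losing an instance wastes the rest of its block, which is fine: gaps in the ID space cost a few short codes, never correctness.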
Database Design & Storage
Keep the schema minimal. Every extra column is storage overhead multiplied by billions of rows.
CREATE TABLE urls (
    id BIGINT PRIMARY KEY,
    short_code VARCHAR(10) NOT NULL UNIQUE,
    original_url TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NULL,
    user_id BIGINT NULL
);
-- The UNIQUE constraint on short_code already creates the lookup index.
-- A partial index (PostgreSQL syntax) keeps the expiration sweep cheap:
CREATE INDEX idx_expires_at ON urls (expires_at) WHERE expires_at IS NOT NULL;
-- Separate table for analytics (different access pattern)
CREATE TABLE url_clicks (
id BIGINT PRIMARY KEY,
url_id BIGINT NOT NULL,
clicked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
ip_address INET,
user_agent TEXT,
referer TEXT
);
SQL vs. NoSQL:
For this use case, both work. NoSQL (DynamoDB, Cassandra) offers easier horizontal scaling. SQL (PostgreSQL, MySQL) provides ACID guarantees and simpler querying.
My recommendation: Start with PostgreSQL. It handles this scale fine with proper indexing and partitioning. Switch to NoSQL only when you hit actual scaling limits.
Sharding strategy:
Shard by a stable hash of the full short code. This spreads load evenly; sharding by the leading characters alone would hot-spot writes, since counter-derived codes share prefixes for long stretches.
import hashlib
def get_shard(short_code: str, num_shards: int = 64) -> int:
    """Determine which shard holds this short code."""
    # Use a stable digest: Python's built-in hash() is randomized
    # per process and would route the same code to different shards
    digest = hashlib.md5(short_code.encode()).digest()
    return int.from_bytes(digest[:8], 'big') % num_shards
Scaling for High Availability
With a 100:1 read-write ratio, caching is your primary scaling lever. A properly configured Redis cluster handles millions of requests per second.
Cache-aside pattern implementation:
import redis
import json
from typing import Optional
class URLService:
def __init__(self, redis_client: redis.Redis, db_connection):
self.cache = redis_client
self.db = db_connection
self.cache_ttl = 86400 # 24 hours
    def get_original_url(self, short_code: str) -> Optional[str]:
        """Retrieve original URL with cache-aside pattern."""
        # Try cache first
        cache_key = f"url:{short_code}"
        cached = self.cache.get(cache_key)
        if cached:
            return cached.decode('utf-8')
        # Known-missing code: skip the database entirely
        if self.cache.get(f"url:404:{short_code}"):
            return None
        # Cache miss - query database
        result = self.db.execute(
            "SELECT original_url FROM urls WHERE short_code = %s",
            (short_code,)
        ).fetchone()
        if not result:
            # Cache negative result to prevent repeated DB hits
            self.cache.setex(f"url:404:{short_code}", 300, "1")
            return None
        original_url = result[0]
        # Populate cache
        self.cache.setex(cache_key, self.cache_ttl, original_url)
        return original_url
def create_short_url(self, original_url: str, url_id: int) -> str:
"""Create new short URL and warm the cache."""
short_code = encode_base62(url_id)
# Write to database
self.db.execute(
"INSERT INTO urls (id, short_code, original_url) VALUES (%s, %s, %s)",
(url_id, short_code, original_url)
)
# Warm cache immediately
self.cache.setex(f"url:{short_code}", self.cache_ttl, original_url)
return short_code
Rate limiting:
Protect against abuse with rate limiting at the API gateway level. A fixed-window counter in Redis is the simplest scheme; switch to a token bucket if you need smoother burst handling:
def check_rate_limit(redis_client, user_id: str, limit: int = 100, window: int = 3600) -> bool:
    """Fixed-window rate limiting using Redis."""
    key = f"ratelimit:{user_id}"
    current = redis_client.incr(key)
    if current == 1:
        # First request in the window starts the expiry clock
        redis_client.expire(key, window)
    return current <= limit
API Design
Keep the API simple. Two endpoints handle 99% of use cases.
import time
from fastapi import FastAPI, HTTPException, Response
from pydantic import BaseModel, HttpUrl
app = FastAPI()
class ShortenRequest(BaseModel):
url: HttpUrl
custom_alias: str | None = None
expires_in_days: int | None = None
class ShortenResponse(BaseModel):
short_url: str
short_code: str
expires_at: str | None = None
@app.post("/api/v1/shorten", response_model=ShortenResponse)
def create_short_url(request: ShortenRequest):
"""Create a shortened URL."""
# Validate URL isn't malicious (check against blocklist)
if is_malicious_url(str(request.url)):
raise HTTPException(400, "URL not allowed")
url_id = id_generator.next_id()
short_code = url_service.create_short_url(str(request.url), url_id)
return ShortenResponse(
short_url=f"https://short.url/{short_code}",
short_code=short_code
)
@app.get("/{short_code}")
def redirect_to_url(short_code: str):
"""Redirect to original URL."""
original_url = url_service.get_original_url(short_code)
if not original_url:
raise HTTPException(404, "Short URL not found")
# Fire async analytics event
analytics_queue.send({"short_code": short_code, "timestamp": time.time()})
# 301 for permanent, 302 for temporary (affects caching)
return Response(
status_code=301,
headers={"Location": original_url}
)
301 vs. 302 redirects:
This matters more than most candidates realize:
- 301 (Permanent): Browsers cache the redirect. Faster for users, but you lose visibility into repeat visits. Better for SEO.
- 302 (Temporary): Every request hits your servers. Required if you need accurate analytics or might change the destination.
Choose 302 if analytics matter. Choose 301 if you want to minimize server load.
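One middle ground, sketched here as a plain helper rather than any framework's API: serve 302 but allow brief client-side caching, so repeat clicks within a minute skip your servers while analytics stay close to accurate.

```python
def redirect_response(original_url: str, permanent: bool = False) -> tuple[int, dict]:
    """Return (status_code, headers) for the redirect."""
    if permanent:
        # 301: browsers cache indefinitely; repeat visits never reach us
        return 301, {"Location": original_url}
    # 302 with a short max-age: near-complete analytics, some load shaved
    return 302, {"Location": original_url, "Cache-Control": "private, max-age=60"}
```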
Follow-up Considerations
Interviewers often extend the problem. Be ready for these:
Analytics architecture: Don’t block redirects for analytics. Write click events to a message queue (Kafka, SQS) and process asynchronously. Store aggregates in a time-series database for dashboards.
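The consumer side can stay simple: roll raw click events up into per-minute counts before writing to the time-series store. A sketch, assuming events shaped like the queue messages above:

```python
from collections import Counter

def aggregate_clicks(events: list[dict]) -> Counter:
    """Roll click events up into (short_code, minute-bucket) counts."""
    counts: Counter = Counter()
    for event in events:
        minute = int(event["timestamp"]) // 60 * 60  # floor to the minute
        counts[(event["short_code"], minute)] += 1
    return counts
```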
Custom aliases: Add a uniqueness check before accepting custom aliases. Reserve common words and brand names. Charge a premium for short custom aliases.
Security concerns:
- Scan submitted URLs against malware/phishing databases
- Implement a preview feature (short.url/abc123+) that shows destination before redirecting
- Rate limit creation by IP and user account
- Consider CAPTCHAs for anonymous URL creation
URL expiration: Run a background job that marks expired URLs as inactive. Don’t delete immediately—you might want to show “this link has expired” instead of 404.
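The sweep can be a periodic batch job. This sketch assumes the `urls` schema above and a DB-API-style connection; it evicts expired codes from the cache, and assumes the redirect query also filters on `expires_at` (otherwise the row would simply be re-cached):

```python
def expire_urls(db, cache, batch_size: int = 1000) -> int:
    """One sweep: find expired codes and evict them from the cache."""
    rows = db.execute(
        "SELECT short_code FROM urls "
        "WHERE expires_at IS NOT NULL AND expires_at < NOW() "
        "LIMIT %s",
        (batch_size,),
    ).fetchall()
    for (short_code,) in rows:
        cache.delete(f"url:{short_code}")  # next lookup re-checks the DB
    return len(rows)
```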
The URL shortener is a deceptively simple system that touches on core distributed systems concepts: distributed ID generation, caching strategies, database sharding, and API design. Master this design, and you’ve demonstrated competence across multiple dimensions that interviewers care about.