Design a URL Shortener: System Design Interview

Key Insights

  • URL shorteners are read-heavy systems (100:1 ratio) where caching strategy matters more than database optimization—a well-designed cache layer handles 90%+ of traffic
  • Base62 encoding with a distributed counter (not hashing) provides guaranteed uniqueness without collision handling complexity, and 7 characters support 3.5 trillion URLs
  • Choose 301 (permanent) vs 302 (temporary) redirects deliberately—this single decision affects caching behavior, analytics accuracy, and SEO

Problem Statement & Requirements

Before diving into architecture, nail down the requirements. Interviewers want to see you ask clarifying questions, not assume.

Functional requirements:

  • Shorten a long URL to a short, unique code
  • Redirect short URLs to original destinations
  • Optional: Track click analytics, custom aliases, expiration

Non-functional requirements:

  • Low latency redirects (< 100ms p99)
  • High availability (99.9%+ uptime)
  • Eventually consistent is acceptable for analytics

Back-of-envelope calculations:

Assume 100 million new URLs per month, with a 100:1 read-to-write ratio:

  • Writes: ~40 URLs/second
  • Reads: ~4,000 redirects/second
  • Storage (5 years): 100M × 12 × 5 = 6 billion URLs
  • At ~500 bytes per record: 3 TB total storage

These numbers tell us: this is a read-heavy system where caching is critical, and storage is manageable with proper partitioning.
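These estimates are simple enough to verify in a few lines:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # ~2.6 million

urls_per_month = 100_000_000
read_write_ratio = 100
bytes_per_record = 500

writes_per_sec = urls_per_month / SECONDS_PER_MONTH   # ~39, round to 40
reads_per_sec = writes_per_sec * read_write_ratio     # ~4,000
total_urls = urls_per_month * 12 * 5                  # 6 billion over 5 years
storage_tb = total_urls * bytes_per_record / 1e12     # 3 TB

print(f"{writes_per_sec:.0f} writes/s, {reads_per_sec:.0f} reads/s, "
      f"{total_urls / 1e9:.0f}B URLs, {storage_tb:.0f} TB")
```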

High-Level Architecture

The system breaks into distinct components with clear responsibilities:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │────▶│ API Gateway │────▶│ Load        │
└─────────────┘     │ (Rate Limit)│     │ Balancer    │
                    └─────────────┘     └──────┬──────┘
                    ┌──────────────────────────┼──────────────────────────┐
                    │                          │                          │
              ┌─────▼─────┐            ┌───────▼───────┐          ┌───────▼───────┐
              │ Shortening│            │   Redirect    │          │   Analytics   │
              │  Service  │            │   Service     │          │   Service     │
              └─────┬─────┘            └───────┬───────┘          └───────────────┘
                    │                          │
                    │                    ┌─────▼─────┐
                    │                    │   Cache   │
                    │                    │  (Redis)  │
                    │                    └─────┬─────┘
                    │                          │
              ┌─────▼──────────────────────────▼─────┐
              │            Database                  │
              │         (Partitioned)                │
              └──────────────────────────────────────┘

The redirect service sits on the hot path and must be optimized aggressively. The shortening service handles writes and can tolerate slightly higher latency.

URL Encoding Strategy

You need to convert a numeric ID into a short, URL-safe string. Base62 (a-z, A-Z, 0-9) is the standard choice—no special characters that need URL encoding.

Why 7 characters? 62^7 ≈ 3.5 trillion possible combinations. That’s enough for decades of growth.
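The capacity at each code length is just 62 raised to the length, so the tradeoff is easy to tabulate:

```python
# Distinct codes available at each length with a 62-character alphabet
for length in range(5, 8):
    print(f"{length} chars: {62 ** length:,}")
# 5 chars: ~916 million  -- too tight
# 6 chars: ~57 billion   -- workable
# 7 chars: ~3.5 trillion -- decades of headroom
```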

Counter-based vs. Hash-based:

Hash-based approaches (MD5/SHA256 truncated) seem simpler but create collision nightmares. You’d need collision detection and retry logic, adding latency and complexity.

Counter-based approaches guarantee uniqueness. Use a distributed ID generator like Twitter’s Snowflake or a dedicated sequence service.
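A Snowflake-style ID packs a millisecond timestamp, a machine ID, and a per-millisecond sequence into 64 bits. The sketch below uses the commonly cited 41/10/12-bit layout, not any particular implementation:

```python
def snowflake_id(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Pack fields into one 64-bit integer: unique per machine, ordered by time."""
    assert timestamp_ms < (1 << 41) and machine_id < (1 << 10) and sequence < (1 << 12)
    return (timestamp_ms << 22) | (machine_id << 12) | sequence

# The timestamp occupies the high bits, so IDs generated later always compare greater
a = snowflake_id(timestamp_ms=1, machine_id=1023, sequence=4095)
b = snowflake_id(timestamp_ms=2, machine_id=0, sequence=0)
```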

import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters
BASE = len(ALPHABET)

def encode_base62(num: int) -> str:
    """Convert integer to base62 string."""
    if num == 0:
        return ALPHABET[0]
    
    chars = []
    while num > 0:
        chars.append(ALPHABET[num % BASE])
        num //= BASE
    
    return ''.join(reversed(chars))

def decode_base62(short_code: str) -> int:
    """Convert base62 string back to integer."""
    num = 0
    for char in short_code:
        num = num * BASE + ALPHABET.index(char)
    return num

# Example usage
url_id = 123456789
short_code = encode_base62(url_id)  # Returns "iwaUH"
original_id = decode_base62(short_code)  # Returns 123456789

For distributed ID generation, pre-allocate ID ranges to each service instance. Instance A gets IDs 1-1,000,000, Instance B gets 1,000,001-2,000,000, and so on. This eliminates coordination overhead during writes.
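A minimal sketch of range pre-allocation. Here `fetch_next_range` stands in for the central sequence service (say, a database row updated atomically); the class name and API are illustrative, not a known library:

```python
import itertools
import threading

class RangeIDGenerator:
    """Hands out IDs locally; contacts the allocator once per exhausted range."""

    def __init__(self, fetch_next_range, range_size: int = 1_000_000):
        self._fetch = fetch_next_range
        self._size = range_size
        self._lock = threading.Lock()
        self._next = 0
        self._end = 0  # starts exhausted, so the first call fetches a range

    def next_id(self) -> int:
        with self._lock:
            if self._next >= self._end:
                start = self._fetch(self._size)  # one coordination call per range
                self._next, self._end = start, start + self._size
            self._next += 1
            return self._next - 1

# Simulated central allocator: each call reserves the next contiguous block
_blocks = itertools.count(0)
def fetch_next_range(size: int) -> int:
    return next(_blocks) * size

gen = RangeIDGenerator(fetch_next_range, range_size=1000)
```

Each instance burns one coordination round-trip per million IDs instead of one per write; the cost is that IDs are only ordered within an instance, which is fine for short codes.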

Database Design & Storage

Keep the schema minimal. Every extra column is storage overhead multiplied by billions of rows.

CREATE TABLE urls (
    id BIGINT PRIMARY KEY,
    short_code VARCHAR(10) NOT NULL UNIQUE,
    original_url TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NULL,
    user_id BIGINT NULL
);

-- UNIQUE already backs short_code lookups with an index;
-- a partial index keeps expiry sweeps cheap (PostgreSQL syntax)
CREATE INDEX idx_expires_at ON urls (expires_at)
    WHERE expires_at IS NOT NULL;

-- Separate table for analytics (different access pattern)
CREATE TABLE url_clicks (
    id BIGINT PRIMARY KEY,
    url_id BIGINT NOT NULL,
    clicked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    ip_address INET,
    user_agent TEXT,
    referer TEXT
);

SQL vs. NoSQL:

For this use case, both work. NoSQL (DynamoDB, Cassandra) offers easier horizontal scaling. SQL (PostgreSQL, MySQL) provides ACID guarantees and simpler querying.

My recommendation: Start with PostgreSQL. It handles this scale fine with proper indexing and partitioning. Switch to NoSQL only when you hit actual scaling limits.

Sharding strategy:

Shard by a hash of the short code. Don’t shard by prefix: with counter-based IDs the leading characters change slowly, so prefixes cluster on a few shards. Hashing the full code distributes load evenly.

import hashlib

def get_shard(short_code: str, num_shards: int = 64) -> int:
    """Determine which shard holds this short code."""
    # Python's built-in hash() is salted per process (PYTHONHASHSEED),
    # so use a stable hash that every instance computes identically
    digest = hashlib.md5(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

Scaling for High Availability

With a 100:1 read-write ratio, caching is your primary scaling lever. A properly configured Redis cluster handles millions of requests per second.

Cache-aside pattern implementation:

import redis
from typing import Optional

class URLService:
    def __init__(self, redis_client: redis.Redis, db_connection):
        self.cache = redis_client
        self.db = db_connection
        self.cache_ttl = 86400  # 24 hours
    
    def get_original_url(self, short_code: str) -> Optional[str]:
        """Retrieve original URL with cache-aside pattern."""
        # Try cache first
        cache_key = f"url:{short_code}"
        cached = self.cache.get(cache_key)
        
        if cached:
            return cached.decode('utf-8')
        
        # Honor the negative cache before touching the database
        if self.cache.get(f"url:404:{short_code}"):
            return None
        
        # Cache miss - query database
        result = self.db.execute(
            "SELECT original_url FROM urls WHERE short_code = %s",
            (short_code,)
        ).fetchone()
        
        if not result:
            # Cache negative result to prevent repeated DB hits
            self.cache.setex(f"url:404:{short_code}", 300, "1")
            return None
        
        original_url = result[0]
        
        # Populate cache
        self.cache.setex(cache_key, self.cache_ttl, original_url)
        
        return original_url
    
    def create_short_url(self, original_url: str, url_id: int) -> str:
        """Create new short URL and warm the cache."""
        short_code = encode_base62(url_id)
        
        # Write to database
        self.db.execute(
            "INSERT INTO urls (id, short_code, original_url) VALUES (%s, %s, %s)",
            (url_id, short_code, original_url)
        )
        
        # Warm cache immediately
        self.cache.setex(f"url:{short_code}", self.cache_ttl, original_url)
        
        return short_code

Rate limiting:

Protect against abuse with rate limiting at the API gateway level. A fixed-window counter in Redis is the simplest version:

def check_rate_limit(user_id: str, limit: int = 100, window: int = 3600) -> bool:
    """Fixed-window rate limiting using Redis."""
    key = f"ratelimit:{user_id}"
    current = redis_client.incr(key)
    
    if current == 1:
        redis_client.expire(key, window)
    
    return current <= limit
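The fixed-window counter above allows bursts of up to 2× the limit across a window boundary. A true token bucket refills continuously; an in-process sketch:

```python
import time

class TokenBucket:
    """In-process token bucket: refills `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, never above capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=1, capacity=2)  # 2-request burst, 1 req/s sustained
```

For the distributed case, the same state (token count plus last-refill timestamp) lives in Redis and is updated atomically, typically via a small Lua script.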

API Design

Keep the API simple. Two endpoints handle 99% of use cases.

import time

from fastapi import FastAPI, HTTPException, Response
from pydantic import BaseModel, HttpUrl

app = FastAPI()

class ShortenRequest(BaseModel):
    url: HttpUrl
    custom_alias: str | None = None
    expires_in_days: int | None = None

class ShortenResponse(BaseModel):
    short_url: str
    short_code: str
    expires_at: str | None = None

@app.post("/api/v1/shorten", response_model=ShortenResponse)
def create_short_url(request: ShortenRequest):
    """Create a shortened URL."""
    # Validate URL isn't malicious (check against blocklist)
    if is_malicious_url(str(request.url)):
        raise HTTPException(400, "URL not allowed")
    
    url_id = id_generator.next_id()
    short_code = url_service.create_short_url(str(request.url), url_id)
    
    return ShortenResponse(
        short_url=f"https://short.url/{short_code}",
        short_code=short_code
    )

@app.get("/{short_code}")
def redirect_to_url(short_code: str):
    """Redirect to original URL."""
    original_url = url_service.get_original_url(short_code)
    
    if not original_url:
        raise HTTPException(404, "Short URL not found")
    
    # Fire async analytics event
    analytics_queue.send({"short_code": short_code, "timestamp": time.time()})
    
    # 302 keeps repeat visits hitting our servers, so the analytics
    # event above fires on every click; use 301 only if you accept
    # browsers caching the redirect away
    return Response(
        status_code=302,
        headers={"Location": original_url}
    )

301 vs. 302 redirects:

This matters more than most candidates realize:

  • 301 (Permanent): Browsers cache the redirect. Faster for users, but you lose visibility into repeat visits. Better for SEO.
  • 302 (Temporary): Every request hits your servers. Required if you need accurate analytics or might change the destination.

Choose 302 if analytics matter. Choose 301 if you want to minimize server load.

Follow-up Considerations

Interviewers often extend the problem. Be ready for these:

Analytics architecture: Don’t block redirects for analytics. Write click events to a message queue (Kafka, SQS) and process asynchronously. Store aggregates in a time-series database for dashboards.
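The shape of that pipeline can be shown with an in-process queue standing in for Kafka or SQS (the aggregation here is deliberately trivial):

```python
import collections
import queue
import threading

click_queue: queue.Queue = queue.Queue()
click_counts: collections.Counter = collections.Counter()

def analytics_worker() -> None:
    # Consumer side: drains events and updates aggregates, off the hot path
    while True:
        event = click_queue.get()
        if event is None:  # shutdown sentinel
            return
        click_counts[event["short_code"]] += 1

worker = threading.Thread(target=analytics_worker)
worker.start()

# The redirect handler only enqueues -- it never waits on analytics
for code in ["abc123", "abc123", "xyz789"]:
    click_queue.put({"short_code": code})

click_queue.put(None)
worker.join()
```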

Custom aliases: Add a uniqueness check before accepting custom aliases. Reserve common words and brand names. Charge premium for short custom aliases.

Security concerns:

  • Scan submitted URLs against malware/phishing databases
  • Implement a preview feature (short.url/abc123+) that shows destination before redirecting
  • Rate limit creation by IP and user account
  • Consider CAPTCHAs for anonymous URL creation

URL expiration: Run a background job that marks expired URLs as inactive. Don’t delete immediately—you might want to show “this link has expired” instead of 404.

The URL shortener is a deceptively simple system that touches on core distributed systems concepts: distributed ID generation, caching strategies, database sharding, and API design. Master this design, and you’ve demonstrated competence across multiple dimensions that interviewers care about.
