Design a News Feed: Social Media Feed Generation
Key Insights
- The push vs. pull decision fundamentally shapes your architecture: push (fan-out on write) optimizes for read latency but struggles with celebrity users, while pull (fan-out on read) handles high-follower accounts but increases read complexity
- Hybrid fan-out is the industry standard—push to active users with moderate follower counts, pull for celebrities and inactive users at read time
- Your feed ranking system will make or break user engagement; start with a simple scoring function combining recency, engagement signals, and user affinity before investing in ML pipelines
Introduction & Problem Definition
The news feed is deceptively simple from a user’s perspective: open the app, see relevant content from people you follow. Behind that simplicity lies one of the most challenging distributed systems problems in software engineering.
A news feed system must aggregate content from potentially thousands of followed accounts, rank that content by relevance, and serve it with sub-100ms latency to millions of concurrent users. Facebook processes over 100 billion feed requests daily. Twitter handles 500 million tweets per day. LinkedIn generates feeds for 900 million professionals.
The core challenges break down into three categories:
- Scale: A single user might follow 1,000 accounts, each posting multiple times daily. Multiply that by hundreds of millions of users.
- Real-time updates: Users expect fresh content within seconds of posting.
- Personalization: Chronological feeds are dead. Users expect algorithmically ranked content tailored to their interests.
This article walks through a production-ready architecture for building a news feed system that handles these challenges.
High-Level Architecture Overview
The fundamental architectural decision is how you distribute the work of feed generation: at write time (push) or at read time (pull).
Push Model (Fan-out on Write): When a user posts, immediately write that post to every follower’s feed cache. Reads become trivial—just fetch the pre-computed feed.
Pull Model (Fan-out on Read): Store each post only once, in its author's timeline. When a user requests their feed, query all followed accounts and merge the results in real time.
Neither approach works in isolation at scale. The industry consensus is a hybrid model:
┌─────────────────────────────────────────────────────────────┐
│                         API Gateway                         │
└─────────────────────────────┬───────────────────────────────┘
                              │
              ┌───────────────┼───────────────┐
              ▼               ▼               ▼
       ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
       │ Post Service│ │Feed Service │ │Ranking Svc  │
       └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
              │               │               │
              ▼               ▼               ▼
       ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
       │  Post DB    │ │ Feed Cache  │ │ ML Features │
       │ (Postgres)  │ │   (Redis)   │ │    Store    │
       └─────────────┘ └─────────────┘ └─────────────┘
              │
              ▼
       ┌─────────────┐
       │   Fan-out   │◄──── Message Queue (Kafka)
       │   Workers   │
       └─────────────┘
interface FeedService {
  // Fetch paginated feed for a user
  getFeed(userId: string, cursor?: string, limit?: number): Promise<FeedResponse>;

  // Called when a new post is created
  onPostCreated(post: Post): Promise<void>;

  // Invalidate feed cache for specific users
  invalidateFeeds(userIds: string[]): Promise<void>;
}

interface RankingService {
  // Score and sort feed items
  rankFeedItems(userId: string, items: FeedItem[]): Promise<RankedFeedItem[]>;
}

interface FanoutService {
  // Distribute post to follower feeds
  fanoutPost(post: Post, followerIds: string[]): Promise<void>;
}
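These interfaces leave the hybrid read path implicit: at request time, the pre-computed (pushed) feed must be merged with posts pulled from followed celebrity accounts. One way to sketch that merge in isolation, assuming both inputs arrive as (post_id, score) pairs already sorted by score descending (the function name `merge_hybrid_feed` is illustrative, not part of the interfaces above):

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

FeedEntry = Tuple[str, float]  # (post_id, score)

def merge_hybrid_feed(
    pushed: Iterable[FeedEntry],
    pulled: Iterable[FeedEntry],
    limit: int = 20,
) -> List[FeedEntry]:
    """Merge the pushed feed cache with pulled celebrity posts, highest score first.

    heapq.merge streams both sorted inputs without materializing them,
    so only `limit` entries are ever pulled.
    """
    merged = heapq.merge(pushed, pulled, key=lambda e: e[1], reverse=True)
    return list(islice(merged, limit))
```

Because `heapq.merge` is lazy, the pulled side can be a generator that fetches celebrity posts on demand, keeping the read path cheap when the pushed cache alone fills the page.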
Data Modeling & Storage
Three core data structures power the feed system:
User Graph: Who follows whom. This is read-heavy and relatively stable. Store in a graph database or denormalized in Redis for fast lookups.
Post Storage: The source of truth for all content. Use a relational database with proper indexing.
Feed Cache: Pre-computed feeds stored as sorted sets, keyed by user ID and scored by timestamp or ranking score.
-- Posts table
CREATE TABLE posts (
    id UUID PRIMARY KEY,
    author_id UUID NOT NULL REFERENCES users(id),
    content TEXT NOT NULL,
    media_urls TEXT[],
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    engagement_score FLOAT DEFAULT 0
);

-- Postgres does not support inline INDEX clauses in CREATE TABLE;
-- indexes are created separately
CREATE INDEX idx_author_created ON posts (author_id, created_at DESC);

-- Followers table (adjacency list)
CREATE TABLE follows (
    follower_id UUID NOT NULL REFERENCES users(id),
    followee_id UUID NOT NULL REFERENCES users(id),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    PRIMARY KEY (follower_id, followee_id)
);

CREATE INDEX idx_followee ON follows (followee_id);
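For contrast with the pre-computed push path, the pure pull model over these two tables is a single join at read time. A minimal sketch using SQLite in place of Postgres (simplified columns, illustrative data, and a hypothetical `pull_feed` helper):

```python
import sqlite3

# In-memory stand-in for the schema above, trimmed to what the query needs
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id TEXT PRIMARY KEY, author_id TEXT, content TEXT, created_at REAL);
CREATE TABLE follows (follower_id TEXT, followee_id TEXT, PRIMARY KEY (follower_id, followee_id));
""")
conn.executemany("INSERT INTO follows VALUES (?, ?)", [("u1", "a1"), ("u1", "a2")])
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", [
    ("p1", "a1", "hello", 100.0),
    ("p2", "a2", "world", 200.0),
    ("p3", "a3", "hidden", 300.0),  # u1 does not follow a3
])

def pull_feed(conn, user_id, limit=20):
    """Merge followed authors' posts at read time, newest first."""
    return conn.execute("""
        SELECT p.id FROM posts p
        JOIN follows f ON f.followee_id = p.author_id
        WHERE f.follower_id = ?
        ORDER BY p.created_at DESC
        LIMIT ?
    """, (user_id, limit)).fetchall()
```

The `idx_author_created` index makes this join tolerable, but the query still fans out across every followed author on each read, which is exactly the latency cost the push model pays for up front.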
Redis sorted sets are ideal for feed storage. The score represents the ranking value (often timestamp for chronological, or a computed relevance score):
import redis
from typing import List

class FeedCache:
    def __init__(self, redis_client: redis.Redis):
        self.redis = redis_client
        self.feed_size_limit = 1000  # Keep last 1000 items per user

    def add_to_feed(self, user_id: str, post_id: str, score: float) -> None:
        """Add a post to a user's feed cache."""
        key = f"feed:{user_id}"
        pipeline = self.redis.pipeline()
        pipeline.zadd(key, {post_id: score})
        pipeline.zremrangebyrank(key, 0, -self.feed_size_limit - 1)  # Trim old items
        pipeline.expire(key, 86400 * 7)  # 7-day TTL
        pipeline.execute()

    def get_feed(self, user_id: str, offset: int = 0, limit: int = 20) -> List[str]:
        """Retrieve feed items, highest score first."""
        key = f"feed:{user_id}"
        return self.redis.zrevrange(key, offset, offset + limit - 1)

    def bulk_add_to_feeds(self, user_ids: List[str], post_id: str, score: float) -> None:
        """Fan-out post to multiple user feeds."""
        pipeline = self.redis.pipeline()
        for user_id in user_ids:
            key = f"feed:{user_id}"
            pipeline.zadd(key, {post_id: score})
        pipeline.execute()
Feed Generation Pipeline
The fan-out worker is the heart of the push model. When a post is created, it’s published to a message queue. Workers consume these messages and distribute posts to follower feeds.
from dataclasses import dataclass
from typing import List
import json

@dataclass
class PostCreatedEvent:
    post_id: str
    author_id: str
    created_at: float
    is_celebrity: bool  # Authors with >100k followers

class FanoutWorker:
    CELEBRITY_THRESHOLD = 100_000
    BATCH_SIZE = 1000

    def __init__(self, feed_cache: FeedCache, user_service, kafka_consumer):
        self.feed_cache = feed_cache
        self.user_service = user_service
        self.consumer = kafka_consumer

    def process_post_event(self, event: PostCreatedEvent) -> None:
        """Handle fan-out for a new post."""
        # Skip fan-out for celebrity users—handled at read time
        if event.is_celebrity:
            self._store_in_celebrity_posts(event)
            return

        # Get all followers
        followers = self.user_service.get_followers(event.author_id)

        # Filter to active users only (logged in within 7 days)
        active_followers = self.user_service.filter_active_users(followers)

        # Batch the fan-out to avoid overwhelming Redis
        for batch in self._chunk(active_followers, self.BATCH_SIZE):
            self.feed_cache.bulk_add_to_feeds(
                user_ids=batch,
                post_id=event.post_id,
                score=event.created_at
            )

    def _store_in_celebrity_posts(self, event: PostCreatedEvent) -> None:
        """Store celebrity posts separately for pull-based retrieval."""
        key = f"celebrity_posts:{event.author_id}"
        self.feed_cache.redis.zadd(key, {event.post_id: event.created_at})

    def _chunk(self, items: List, size: int):
        for i in range(0, len(items), size):
            yield items[i:i + size]

    def run(self):
        """Main worker loop."""
        for message in self.consumer:
            event = PostCreatedEvent(**json.loads(message.value))
            self.process_post_event(event)
For the message queue, Kafka provides the durability and throughput needed:
from typing import List
from kafka import KafkaProducer
import json

class PostEventProducer:
    def __init__(self, bootstrap_servers: List[str]):
        self.producer = KafkaProducer(
            bootstrap_servers=bootstrap_servers,
            value_serializer=lambda v: json.dumps(v).encode('utf-8'),
            acks='all',  # Wait for all replicas
            retries=3
        )

    def publish_post_created(self, post: Post) -> None:
        event = {
            'post_id': post.id,
            'author_id': post.author_id,
            'created_at': post.created_at.timestamp(),
            'is_celebrity': post.author.follower_count > 100_000
        }
        # Partition by author_id for ordering guarantees
        self.producer.send(
            'post-created',
            value=event,
            key=post.author_id.encode('utf-8')
        )
Ranking & Personalization
Chronological feeds are simple but lead to poor engagement. Modern feeds rank content by predicted relevance. Start with a simple scoring function before investing in ML:
from dataclasses import dataclass
from math import log
from typing import List
import time

@dataclass
class FeedItem:
    post_id: str
    author_id: str
    created_at: float
    like_count: int
    comment_count: int
    share_count: int

@dataclass
class UserAffinityScore:
    author_id: str
    score: float  # 0.0 to 1.0, based on past interactions

class SimpleRanker:
    # Weights for different signals
    RECENCY_WEIGHT = 0.3
    ENGAGEMENT_WEIGHT = 0.25
    AFFINITY_WEIGHT = 0.45

    # Time decay: half-life of 6 hours
    HALF_LIFE_SECONDS = 6 * 3600

    def __init__(self, affinity_service):
        self.affinity_service = affinity_service

    def rank_items(self, user_id: str, items: List[FeedItem]) -> List[FeedItem]:
        """Score and sort feed items by relevance."""
        # Batch fetch affinity scores
        author_ids = [item.author_id for item in items]
        affinities = self.affinity_service.get_affinities(user_id, author_ids)
        affinity_map = {a.author_id: a.score for a in affinities}

        scored_items = []
        for item in items:
            score = self._compute_score(item, affinity_map.get(item.author_id, 0.1))
            scored_items.append((score, item))

        # Sort by score descending
        scored_items.sort(key=lambda x: x[0], reverse=True)
        return [item for _, item in scored_items]

    def _compute_score(self, item: FeedItem, affinity: float) -> float:
        # Recency score with exponential decay
        age_seconds = time.time() - item.created_at
        recency_score = 2 ** (-age_seconds / self.HALF_LIFE_SECONDS)

        # Engagement score (log scale to prevent viral posts from dominating)
        engagement = item.like_count + 2 * item.comment_count + 3 * item.share_count
        engagement_score = min(1.0, log(1 + engagement) / 10)

        # Weighted combination
        return (
            self.RECENCY_WEIGHT * recency_score +
            self.ENGAGEMENT_WEIGHT * engagement_score +
            self.AFFINITY_WEIGHT * affinity
        )
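To see how the weights interact, the same formula can be exercised standalone with fixed inputs instead of wall-clock time. With zero engagement and zero affinity, a brand-new post scores exactly the recency weight (0.3), and every six hours halves that contribution:

```python
from math import log

# Same constants as SimpleRanker above
RECENCY_WEIGHT, ENGAGEMENT_WEIGHT, AFFINITY_WEIGHT = 0.3, 0.25, 0.45
HALF_LIFE_SECONDS = 6 * 3600

def feed_score(age_seconds: float, likes: int, comments: int,
               shares: int, affinity: float) -> float:
    """Standalone version of SimpleRanker._compute_score, with age passed explicitly."""
    recency = 2 ** (-age_seconds / HALF_LIFE_SECONDS)
    engagement = likes + 2 * comments + 3 * shares
    engagement_score = min(1.0, log(1 + engagement) / 10)
    return (RECENCY_WEIGHT * recency
            + ENGAGEMENT_WEIGHT * engagement_score
            + AFFINITY_WEIGHT * affinity)

# Fresh post, no engagement, unknown author: score == 0.3 (pure recency)
# Six hours old: the recency term halves, so the score drops to 0.15
```

Passing age explicitly also makes the function deterministic and trivially unit-testable, which the `time.time()` call inside `_compute_score` prevents.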
Scaling & Performance Optimization
Cursor-based pagination is essential for feeds. Offset-based pagination breaks when new items are added between requests:
import base64
import json
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class FeedCursor:
    last_score: float
    last_post_id: str

    def encode(self) -> str:
        data = {'s': self.last_score, 'p': self.last_post_id}
        return base64.urlsafe_b64encode(json.dumps(data).encode()).decode()

    @classmethod
    def decode(cls, cursor_str: str) -> 'FeedCursor':
        data = json.loads(base64.urlsafe_b64decode(cursor_str))
        return cls(last_score=data['s'], last_post_id=data['p'])

class PaginatedFeedService:
    def __init__(self, feed_cache: FeedCache, post_repository, ranker: SimpleRanker):
        # Assumes the Redis client is configured with decode_responses=True,
        # so sorted-set members come back as str matching database post IDs
        self.feed_cache = feed_cache
        self.post_repo = post_repository
        self.ranker = ranker

    def get_feed(
        self,
        user_id: str,
        cursor: Optional[str] = None,
        limit: int = 20
    ) -> Tuple[List[Post], Optional[str]]:
        """Fetch paginated feed with cursor."""
        # Fetch more than needed to account for filtering
        fetch_limit = limit * 2

        if cursor:
            decoded = FeedCursor.decode(cursor)
            # Use ZREVRANGEBYSCORE for cursor-based pagination.
            # Note: an exclusive max skips any items tied at last_score;
            # production code should break ties using last_post_id
            post_ids = self.feed_cache.redis.zrevrangebyscore(
                f"feed:{user_id}",
                max=f"({decoded.last_score}",  # Exclusive
                min="-inf",
                start=0,
                num=fetch_limit,
                withscores=True
            )
        else:
            post_ids = self.feed_cache.redis.zrevrange(
                f"feed:{user_id}",
                0,
                fetch_limit - 1,
                withscores=True
            )

        if not post_ids:
            return [], None

        # Hydrate posts from database
        posts = self.post_repo.get_by_ids([pid for pid, _ in post_ids])

        # Build next cursor from last item
        result_posts = posts[:limit]
        next_cursor = None
        if len(posts) > limit:
            last_post = result_posts[-1]
            last_score = next(s for pid, s in post_ids if pid == last_post.id)
            next_cursor = FeedCursor(last_score, last_post.id).encode()

        return result_posts, next_cursor
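Since clients echo the cursor back as an opaque token, it is worth verifying the encoding survives a round trip. A standalone check (the dataclass is repeated here so the snippet runs on its own):

```python
import base64
import json
from dataclasses import dataclass

@dataclass
class FeedCursor:
    """Same shape as the service's cursor class above."""
    last_score: float
    last_post_id: str

    def encode(self) -> str:
        data = {'s': self.last_score, 'p': self.last_post_id}
        return base64.urlsafe_b64encode(json.dumps(data).encode()).decode()

    @classmethod
    def decode(cls, cursor_str: str) -> 'FeedCursor':
        data = json.loads(base64.urlsafe_b64decode(cursor_str))
        return cls(last_score=data['s'], last_post_id=data['p'])

cursor = FeedCursor(last_score=1700000000.5, last_post_id="post-abc")
token = cursor.encode()  # URL-safe, so it can ride in a query parameter
restored = FeedCursor.decode(token)  # dataclass equality makes comparison exact
```

JSON round-trips Python floats exactly, so the score-based continuation point is preserved bit-for-bit.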
Conclusion & Trade-offs Summary
Building a news feed system requires balancing competing concerns:
Push vs. Pull: Pure push fails for celebrity accounts (imagine fanning out to 50 million followers). Pure pull creates read-time latency spikes. Hybrid approaches—push for normal users, pull for celebrities—are the industry standard.
Cost vs. Latency: Pre-computing feeds uses more storage and write bandwidth but delivers sub-10ms read latency. Pull-based approaches save storage but require complex read-time aggregation.
Simplicity vs. Personalization: Chronological feeds are simple but underperform on engagement. ML-based ranking improves engagement but adds operational complexity.
Start simple: implement push-based fan-out with a basic scoring function. Measure everything—cache hit rates, p99 latencies, fan-out queue depths. Add complexity only when metrics justify it.
The architecture described here handles millions of users. Beyond that scale, you’ll need additional optimizations: geographic distribution, tiered caching, and sophisticated ML ranking pipelines. But those are problems you want to have.