Design a News Feed: Social Media Feed Generation
Key Insights
- The push vs. pull decision fundamentally shapes your architecture: push (fan-out on write) optimizes for read latency but struggles with celebrity users, while pull (fan-out on read) handles high-follower accounts but increases read complexity
- Hybrid fan-out is the industry standard—push to active users with moderate follower counts, pull for celebrities and inactive users at read time
- Your feed ranking system will make or break user engagement; start with a simple scoring function combining recency, engagement signals, and user affinity before investing in ML pipelines
Introduction & Problem Definition
The news feed is deceptively simple from a user’s perspective: open the app, see relevant content from people you follow. Behind that simplicity lies one of the most challenging distributed systems problems in software engineering.
A news feed system must aggregate content from potentially thousands of followed accounts, rank that content by relevance, and serve it with sub-100ms latency to millions of concurrent users. Facebook processes over 100 billion feed requests daily. Twitter handles 500 million tweets per day. LinkedIn generates feeds for 900 million professionals.
The core challenges break down into three categories:
- Scale: A single user might follow 1,000 accounts, each posting multiple times daily. Multiply that by hundreds of millions of users.
- Real-time updates: Users expect fresh content within seconds of posting.
- Personalization: Chronological feeds are dead. Users expect algorithmically ranked content tailored to their interests.
This article walks through a production-ready architecture for building a news feed system that handles these challenges.
High-Level Architecture Overview
The fundamental architectural decision is how you distribute the work of feed generation: at write time (push) or at read time (pull).
Push Model (Fan-out on Write): When a user posts, immediately write that post to every follower’s feed cache. Reads become trivial—just fetch the pre-computed feed.
Pull Model (Fan-out on Read): Store each post only once, in its author's timeline. When a user requests their feed, query all followed accounts and merge the results in real time.
Neither approach works in isolation at scale. The industry consensus is a hybrid model:
┌─────────────────────────────────────────────────────────────┐
│                         API Gateway                         │
└─────────────────────────────┬───────────────────────────────┘
                              │
              ┌───────────────┼───────────────┐
              ▼               ▼               ▼
       ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
       │ Post Service│ │Feed Service │ │Ranking Svc  │
       └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
              │               │               │
              ▼               ▼               ▼
       ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
       │  Post DB    │ │ Feed Cache  │ │ ML Features │
       │ (Postgres)  │ │   (Redis)   │ │    Store    │
       └─────────────┘ └─────────────┘ └─────────────┘
              │
              ▼
       ┌─────────────┐
       │   Fan-out   │◄──── Message Queue (Kafka)
       │   Workers   │
       └─────────────┘
interface FeedService {
  // Fetch paginated feed for a user
  getFeed(userId: string, cursor?: string, limit?: number): Promise<FeedResponse>;

  // Called when a new post is created
  onPostCreated(post: Post): Promise<void>;

  // Invalidate feed cache for specific users
  invalidateFeeds(userIds: string[]): Promise<void>;
}

interface RankingService {
  // Score and sort feed items
  rankFeedItems(userId: string, items: FeedItem[]): Promise<RankedFeedItem[]>;
}

interface FanoutService {
  // Distribute post to follower feeds
  fanoutPost(post: Post, followerIds: string[]): Promise<void>;
}
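These interfaces leave the hybrid read path implicit: at request time, the pre-computed (pushed) feed must be merged with posts pulled from followed celebrity accounts. One way to sketch that merge in isolation, assuming both inputs arrive as (post_id, score) pairs already sorted by score descending (the function name `merge_hybrid_feed` is illustrative, not part of the interfaces above):

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

FeedEntry = Tuple[str, float]  # (post_id, score)

def merge_hybrid_feed(
    pushed: Iterable[FeedEntry],
    pulled: Iterable[FeedEntry],
    limit: int = 20,
) -> List[FeedEntry]:
    """Merge the pushed feed cache with pulled celebrity posts, highest score first.

    heapq.merge streams both sorted inputs without materializing them,
    so only `limit` entries are ever pulled.
    """
    merged = heapq.merge(pushed, pulled, key=lambda e: e[1], reverse=True)
    return list(islice(merged, limit))
```

Because `heapq.merge` is lazy, the pulled side can be a generator that fetches celebrity posts on demand, keeping the read path cheap when the pushed cache alone fills the page.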
Data Modeling & Storage
Three core data structures power the feed system:
User Graph: Who follows whom. This is read-heavy and relatively stable. Store in a graph database or denormalized in Redis for fast lookups.
Post Storage: The source of truth for all content. Use a relational database with proper indexing.
Feed Cache: Pre-computed feeds stored as sorted sets, keyed by user ID and scored by timestamp or ranking score.
-- Posts table
CREATE TABLE posts (
    id UUID PRIMARY KEY,
    author_id UUID NOT NULL REFERENCES users(id),
    content TEXT NOT NULL,
    media_urls TEXT[],
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    engagement_score FLOAT DEFAULT 0
);

-- Postgres does not support inline INDEX clauses in CREATE TABLE;
-- indexes are created separately
CREATE INDEX idx_author_created ON posts (author_id, created_at DESC);

-- Followers table (adjacency list)
CREATE TABLE follows (
    follower_id UUID NOT NULL REFERENCES users(id),
    followee_id UUID NOT NULL REFERENCES users(id),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    PRIMARY KEY (follower_id, followee_id)
);

CREATE INDEX idx_followee ON follows (followee_id);
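For contrast with the pre-computed push path, the pure pull model over these two tables is a single join at read time. A minimal sketch using SQLite in place of Postgres (simplified columns, illustrative data, and a hypothetical `pull_feed` helper):

```python
import sqlite3

# In-memory stand-in for the schema above, trimmed to what the query needs
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id TEXT PRIMARY KEY, author_id TEXT, content TEXT, created_at REAL);
CREATE TABLE follows (follower_id TEXT, followee_id TEXT, PRIMARY KEY (follower_id, followee_id));
""")
conn.executemany("INSERT INTO follows VALUES (?, ?)", [("u1", "a1"), ("u1", "a2")])
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", [
    ("p1", "a1", "hello", 100.0),
    ("p2", "a2", "world", 200.0),
    ("p3", "a3", "hidden", 300.0),  # u1 does not follow a3
])

def pull_feed(conn, user_id, limit=20):
    """Merge followed authors' posts at read time, newest first."""
    return conn.execute("""
        SELECT p.id FROM posts p
        JOIN follows f ON f.followee_id = p.author_id
        WHERE f.follower_id = ?
        ORDER BY p.created_at DESC
        LIMIT ?
    """, (user_id, limit)).fetchall()
```

The `idx_author_created` index makes this join tolerable, but the query still fans out across every followed author on each read, which is exactly the latency cost the push model pays for up front.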
Redis sorted sets are ideal for feed storage. The score represents the ranking value (often timestamp for chronological, or a computed relevance score):
import redis
from typing import List

class FeedCache:
    def __init__(self, redis_client: redis.Redis):
        self.redis = redis_client
        self.feed_size_limit = 1000  # Keep last 1000 items per user

    def add_to_feed(self, user_id: str, post_id: str, score: float) -> None:
        """Add a post to a user's feed cache."""
        key = f"feed:{user_id}"
        pipeline = self.redis.pipeline()
        pipeline.zadd(key, {post_id: score})
        pipeline.zremrangebyrank(key, 0, -self.feed_size_limit - 1)  # Trim old items
        pipeline.expire(key, 86400 * 7)  # 7-day TTL
        pipeline.execute()

    def get_feed(self, user_id: str, offset: int = 0, limit: int = 20) -> List[str]:
        """Retrieve feed items, highest score first."""
        key = f"feed:{user_id}"
        return self.redis.zrevrange(key, offset, offset + limit - 1)

    def bulk_add_to_feeds(self, user_ids: List[str], post_id: str, score: float) -> None:
        """Fan-out post to multiple user feeds."""
        pipeline = self.redis.pipeline()
        for user_id in user_ids:
            key = f"feed:{user_id}"
            pipeline.zadd(key, {post_id: score})
        pipeline.execute()
Feed Generation Pipeline
The fan-out worker is the heart of the push model. When a post is created, it’s published to a message queue. Workers consume these messages and distribute posts to follower feeds.
from dataclasses import dataclass
from typing import List
import json

@dataclass
class PostCreatedEvent:
    post_id: str
    author_id: str
    created_at: float
    is_celebrity: bool  # Authors with >100k followers

class FanoutWorker:
    CELEBRITY_THRESHOLD = 100_000
    BATCH_SIZE = 1000

    def __init__(self, feed_cache: FeedCache, user_service, kafka_consumer):
        self.feed_cache = feed_cache
        self.user_service = user_service
        self.consumer = kafka_consumer

    def process_post_event(self, event: PostCreatedEvent) -> None:
        """Handle fan-out for a new post."""
        # Skip fan-out for celebrity users—handled at read time
        if event.is_celebrity:
            self._store_in_celebrity_posts(event)
            return

        # Get all followers
        followers = self.user_service.get_followers(event.author_id)

        # Filter to active users only (logged in within 7 days)
        active_followers = self.user_service.filter_active_users(followers)

        # Batch the fan-out to avoid overwhelming Redis
        for batch in self._chunk(active_followers, self.BATCH_SIZE):
            self.feed_cache.bulk_add_to_feeds(
                user_ids=batch,
                post_id=event.post_id,
                score=event.created_at
            )

    def _store_in_celebrity_posts(self, event: PostCreatedEvent) -> None:
        """Store celebrity posts separately for pull-based retrieval."""
        key = f"celebrity_posts:{event.author_id}"
        self.feed_cache.redis.zadd(key, {event.post_id: event.created_at})

    def _chunk(self, items: List, size: int):
        for i in range(0, len(items), size):
            yield items[i:i + size]

    def run(self):
        """Main worker loop."""
        for message in self.consumer:
            event = PostCreatedEvent(**json.loads(message.value))
            self.process_post_event(event)
For the message queue, Kafka provides the durability and throughput needed:
from typing import List
from kafka import KafkaProducer
import json

class PostEventProducer:
    def __init__(self, bootstrap_servers: List[str]):
        self.producer = KafkaProducer(
            bootstrap_servers=bootstrap_servers,
            value_serializer=lambda v: json.dumps(v).encode('utf-8'),
            acks='all',  # Wait for all replicas
            retries=3
        )

    def publish_post_created(self, post: Post) -> None:
        event = {
            'post_id': post.id,
            'author_id': post.author_id,
            'created_at': post.created_at.timestamp(),
            'is_celebrity': post.author.follower_count > 100_000
        }
        # Partition by author_id for ordering guarantees
        self.producer.send(
            'post-created',
            value=event,
            key=post.author_id.encode('utf-8')
        )
Ranking & Personalization
Chronological feeds are simple but lead to poor engagement. Modern feeds rank content by predicted relevance. Start with a simple scoring function before investing in ML:
from dataclasses import dataclass
from math import log
from typing import List
import time

@dataclass
class FeedItem:
    post_id: str
    author_id: str
    created_at: float
    like_count: int
    comment_count: int
    share_count: int

@dataclass
class UserAffinityScore:
    author_id: str
    score: float  # 0.0 to 1.0, based on past interactions

class SimpleRanker:
    # Weights for different signals
    RECENCY_WEIGHT = 0.3
    ENGAGEMENT_WEIGHT = 0.25
    AFFINITY_WEIGHT = 0.45

    # Time decay: half-life of 6 hours
    HALF_LIFE_SECONDS = 6 * 3600

    def __init__(self, affinity_service):
        self.affinity_service = affinity_service

    def rank_items(self, user_id: str, items: List[FeedItem]) -> List[FeedItem]:
        """Score and sort feed items by relevance."""
        # Batch fetch affinity scores
        author_ids = [item.author_id for item in items]
        affinities = self.affinity_service.get_affinities(user_id, author_ids)
        affinity_map = {a.author_id: a.score for a in affinities}

        scored_items = []
        for item in items:
            score = self._compute_score(item, affinity_map.get(item.author_id, 0.1))
            scored_items.append((score, item))

        # Sort by score descending
        scored_items.sort(key=lambda x: x[0], reverse=True)
        return [item for _, item in scored_items]

    def _compute_score(self, item: FeedItem, affinity: float) -> float:
        # Recency score with exponential decay
        age_seconds = time.time() - item.created_at
        recency_score = 2 ** (-age_seconds / self.HALF_LIFE_SECONDS)

        # Engagement score (log scale to prevent viral posts from dominating)
        engagement = item.like_count + 2 * item.comment_count + 3 * item.share_count
        engagement_score = min(1.0, log(1 + engagement) / 10)

        # Weighted combination
        return (
            self.RECENCY_WEIGHT * recency_score +
            self.ENGAGEMENT_WEIGHT * engagement_score +
            self.AFFINITY_WEIGHT * affinity
        )
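To see how the weights interact, the same formula can be exercised standalone with fixed inputs instead of wall-clock time. With zero engagement and zero affinity, a brand-new post scores exactly the recency weight (0.3), and every six hours halves that contribution:

```python
from math import log

# Same constants as SimpleRanker above
RECENCY_WEIGHT, ENGAGEMENT_WEIGHT, AFFINITY_WEIGHT = 0.3, 0.25, 0.45
HALF_LIFE_SECONDS = 6 * 3600

def feed_score(age_seconds: float, likes: int, comments: int,
               shares: int, affinity: float) -> float:
    """Standalone version of SimpleRanker._compute_score, with age passed explicitly."""
    recency = 2 ** (-age_seconds / HALF_LIFE_SECONDS)
    engagement = likes + 2 * comments + 3 * shares
    engagement_score = min(1.0, log(1 + engagement) / 10)
    return (RECENCY_WEIGHT * recency
            + ENGAGEMENT_WEIGHT * engagement_score
            + AFFINITY_WEIGHT * affinity)

# Fresh post, no engagement, unknown author: score == 0.3 (pure recency)
# Six hours old: the recency term halves, so the score drops to 0.15
```

Passing age explicitly also makes the function deterministic and trivially unit-testable, which the `time.time()` call inside `_compute_score` prevents.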
Scaling & Performance Optimization
Cursor-based pagination is essential for feeds. Offset-based pagination breaks when new items are added between requests:
import base64
import json
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class FeedCursor:
    last_score: float
    last_post_id: str

    def encode(self) -> str:
        data = {'s': self.last_score, 'p': self.last_post_id}
        return base64.urlsafe_b64encode(json.dumps(data).encode()).decode()

    @classmethod
    def decode(cls, cursor_str: str) -> 'FeedCursor':
        data = json.loads(base64.urlsafe_b64decode(cursor_str))
        return cls(last_score=data['s'], last_post_id=data['p'])

class PaginatedFeedService:
    def __init__(self, feed_cache: FeedCache, post_repository, ranker: SimpleRanker):
        # Assumes the Redis client is configured with decode_responses=True,
        # so sorted-set members come back as str matching database post IDs
        self.feed_cache = feed_cache
        self.post_repo = post_repository
        self.ranker = ranker

    def get_feed(
        self,
        user_id: str,
        cursor: Optional[str] = None,
        limit: int = 20
    ) -> Tuple[List[Post], Optional[str]]:
        """Fetch paginated feed with cursor."""
        # Fetch more than needed to account for filtering
        fetch_limit = limit * 2

        if cursor:
            decoded = FeedCursor.decode(cursor)
            # Use ZREVRANGEBYSCORE for cursor-based pagination.
            # Note: an exclusive max skips any items tied at last_score;
            # production code should break ties using last_post_id
            post_ids = self.feed_cache.redis.zrevrangebyscore(
                f"feed:{user_id}",
                max=f"({decoded.last_score}",  # Exclusive
                min="-inf",
                start=0,
                num=fetch_limit,
                withscores=True
            )
        else:
            post_ids = self.feed_cache.redis.zrevrange(
                f"feed:{user_id}",
                0,
                fetch_limit - 1,
                withscores=True
            )

        if not post_ids:
            return [], None

        # Hydrate posts from database
        posts = self.post_repo.get_by_ids([pid for pid, _ in post_ids])

        # Build next cursor from last item
        result_posts = posts[:limit]
        next_cursor = None
        if len(posts) > limit:
            last_post = result_posts[-1]
            last_score = next(s for pid, s in post_ids if pid == last_post.id)
            next_cursor = FeedCursor(last_score, last_post.id).encode()

        return result_posts, next_cursor
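Since clients echo the cursor back as an opaque token, it is worth verifying the encoding survives a round trip. A standalone check (the dataclass is repeated here so the snippet runs on its own):

```python
import base64
import json
from dataclasses import dataclass

@dataclass
class FeedCursor:
    """Same shape as the service's cursor class above."""
    last_score: float
    last_post_id: str

    def encode(self) -> str:
        data = {'s': self.last_score, 'p': self.last_post_id}
        return base64.urlsafe_b64encode(json.dumps(data).encode()).decode()

    @classmethod
    def decode(cls, cursor_str: str) -> 'FeedCursor':
        data = json.loads(base64.urlsafe_b64decode(cursor_str))
        return cls(last_score=data['s'], last_post_id=data['p'])

cursor = FeedCursor(last_score=1700000000.5, last_post_id="post-abc")
token = cursor.encode()  # URL-safe, so it can ride in a query parameter
restored = FeedCursor.decode(token)  # dataclass equality makes comparison exact
```

JSON round-trips Python floats exactly, so the score-based continuation point is preserved bit-for-bit.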
Conclusion & Trade-offs Summary
Building a news feed system requires balancing competing concerns:
Push vs. Pull: Pure push fails for celebrity accounts (imagine fanning out to 50 million followers). Pure pull creates read-time latency spikes. Hybrid approaches—push for normal users, pull for celebrities—are the industry standard.
Cost vs. Latency: Pre-computing feeds uses more storage and write bandwidth but delivers sub-10ms read latency. Pull-based approaches save storage but require complex read-time aggregation.
Simplicity vs. Personalization: Chronological feeds are simple but underperform on engagement. ML-based ranking improves engagement but adds operational complexity.
Start simple: implement push-based fan-out with a basic scoring function. Measure everything—cache hit rates, p99 latencies, fan-out queue depths. Add complexity only when metrics justify it.
The architecture described here handles millions of users. Beyond that scale, you’ll need additional optimizations: geographic distribution, tiered caching, and sophisticated ML ranking pipelines. But those are problems you want to have.