Long Polling: Server Push Simulation

Key Insights

  • Long polling simulates server push by holding HTTP connections open until data is available, providing near-real-time updates without WebSocket complexity or browser compatibility concerns.
  • The technique requires careful timeout management on both client and server sides, plus infrastructure awareness of load balancer limits and connection pooling.
  • While WebSockets and SSE have largely superseded long polling for new projects, understanding this pattern remains valuable for legacy systems, constrained environments, and as a fallback mechanism.

The Push Problem

HTTP was designed as a request-response protocol. Clients ask, servers answer. This works beautifully for fetching web pages but falls apart when servers need to notify clients about events—new messages, price updates, system alerts.

Traditional polling solves this crudely: the client asks “anything new?” every few seconds. It works, but it’s wasteful. Most requests return empty responses, yet each one consumes bandwidth, server resources, and battery life on mobile devices.
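In code, traditional polling is just a timer around fetch. A minimal sketch (the endpoint name and 5-second interval are illustrative):

```javascript
// Naive polling: ask every few seconds regardless of whether anything changed.
// Most responses come back empty, but every request still costs a round trip.
function startPolling(url, onMessages, intervalMs = 5000) {
  const timer = setInterval(async () => {
    try {
      const response = await fetch(url);
      const data = await response.json();
      if (data.messages && data.messages.length > 0) {
        onMessages(data.messages);
      }
    } catch (err) {
      console.error('Poll failed:', err.message);
    }
  }, intervalMs);
  return () => clearInterval(timer); // call the returned function to stop polling
}
```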

Long polling flips the script. Instead of the client repeatedly asking and the server immediately answering “no,” the server holds the connection open until it actually has something to say. The client gets data the moment it’s available, with none of the wasted round trips.

This technique bridged the gap between naive polling and WebSockets for years. Facebook’s chat, Gmail’s notifications, and countless real-time features relied on long polling before WebSockets achieved universal browser support. Understanding it remains relevant—both for maintaining legacy systems and as a fallback when WebSockets aren’t viable.

How Long Polling Works

The mechanics are deceptively simple:

  1. Client sends an HTTP request to the server
  2. Server receives the request but doesn’t respond immediately
  3. Server holds the connection open, waiting for relevant data
  4. When data becomes available (or a timeout occurs), server sends the response
  5. Client processes the response and immediately sends a new request
  6. Cycle repeats

Client                          Server
  |                               |
  |-------- HTTP Request -------->|
  |                               | (waiting for data...)
  |                               | (still waiting...)
  |                               | (data arrives!)
  |<------- HTTP Response --------|
  |                               |
  |-------- HTTP Request -------->|  (immediate reconnect)
  |                               | (waiting again...)

The key insight: from the client’s perspective, this looks like a series of slow HTTP requests. No special protocols, no upgrade handshakes, no firewall issues. It’s just HTTP, which means it works everywhere HTTP works.

The timeout is crucial. Without it, connections would hang indefinitely when no data arrives. Typically, servers respond after 20-30 seconds even with no data, signaling the client to reconnect. This prevents proxy timeouts and keeps the connection fresh.

Server-Side Implementation

A long polling endpoint needs to manage waiting connections and dispatch events to them when data arrives. Here’s a practical implementation in Node.js:

const express = require('express');
const { EventEmitter } = require('events');

const app = express();
const eventBus = new EventEmitter();
eventBus.setMaxListeners(1000); // Expect many concurrent connections

const LONG_POLL_TIMEOUT = 30000; // 30 seconds

// Store for pending messages per user
const messageQueues = new Map();

app.get('/api/poll/:userId', (req, res) => {
  const { userId } = req.params;

  // Drain up to 10 messages queued while no connection was waiting
  const drainQueue = () => {
    const queue = messageQueues.get(userId);
    if (!queue || queue.length === 0) return null;
    return queue.splice(0, 10);
  };

  // Respond immediately if messages are already queued
  const queued = drainQueue();
  if (queued) {
    return res.json({ messages: queued, timestamp: Date.now() });
  }

  // Remove the listener and timer; runs on response and on disconnect
  const cleanup = () => {
    clearTimeout(timeoutId);
    eventBus.removeListener('newMessage', messageHandler);
  };

  // Respond as soon as a message for this user arrives
  const messageHandler = ({ targetUserId }) => {
    if (targetUserId !== userId) return;
    const batch = drainQueue();
    if (!batch) return;
    cleanup();
    res.json({ messages: batch, timestamp: Date.now() });
  };

  // Respond empty after the timeout, signaling the client to reconnect
  const timeoutId = setTimeout(() => {
    cleanup();
    res.json({ messages: [], timestamp: Date.now() });
  }, LONG_POLL_TIMEOUT);

  req.on('close', cleanup); // Client disconnected mid-wait
  eventBus.on('newMessage', messageHandler);
});

// Endpoint to trigger events (e.g., sending a message)
app.post('/api/send', express.json(), (req, res) => {
  const { targetUserId, message } = req.body;

  // Queue first so messages survive when no connection is waiting,
  // then notify any connection currently held open for this user
  if (!messageQueues.has(targetUserId)) {
    messageQueues.set(targetUserId, []);
  }
  messageQueues.get(targetUserId).push(message);
  eventBus.emit('newMessage', { targetUserId });

  res.json({ success: true });
});

app.listen(3000);

Critical implementation details:

Cleanup is mandatory. Every held connection must have cleanup logic for both successful responses and client disconnects. Memory leaks from orphaned event listeners will crash your server.

Timeout before infrastructure does. Load balancers, proxies, and browsers all have their own timeout limits. Your application timeout should be shorter than all of them. If your load balancer times out at 60 seconds, set your long poll timeout to 30.
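One way to keep that invariant from silently rotting is to assert it at startup; a sketch, where the 60-second load balancer value is an assumption you would replace with your real infrastructure limit:

```javascript
// Timeout hierarchy: application < client abort < load balancer.
// All values are illustrative; LOAD_BALANCER_TIMEOUT must match your infrastructure.
const SERVER_POLL_TIMEOUT = 30_000;   // server responds empty after this
const CLIENT_ABORT_TIMEOUT = 35_000;  // client gives up slightly later
const LOAD_BALANCER_TIMEOUT = 60_000; // e.g., AWS ALB / nginx defaults

function validateTimeouts() {
  if (SERVER_POLL_TIMEOUT >= CLIENT_ABORT_TIMEOUT) {
    throw new Error('Client must wait longer than the server holds the poll');
  }
  if (CLIENT_ABORT_TIMEOUT >= LOAD_BALANCER_TIMEOUT) {
    throw new Error('Load balancer would kill connections before the client aborts');
  }
  return true;
}
```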

Batch when possible. If multiple events arrive while a connection is waiting, batch them into a single response rather than responding to each individually.
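A minimal batching sketch: after the first event arrives, wait briefly for stragglers, then flush everything in one response. The 50 ms grace window is an illustrative tuning knob, not a value from the server above:

```javascript
// Collect events that arrive within a short window and flush them together,
// so a burst of messages produces one response instead of many.
const BATCH_WINDOW_MS = 50;

function createBatcher(flush) {
  let pending = [];
  let timer = null;
  return function add(event) {
    pending.push(event);
    if (!timer) {
      timer = setTimeout(() => {
        const batch = pending;
        pending = [];
        timer = null;
        flush(batch); // e.g., res.json({ messages: batch })
      }, BATCH_WINDOW_MS);
    }
  };
}
```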

Client-Side Implementation

The client needs to maintain a persistent polling loop with robust error handling:

class LongPollClient {
  constructor(userId, onMessages) {
    this.userId = userId;
    this.onMessages = onMessages;
    this.isRunning = false;
    this.retryDelay = 1000;
    this.maxRetryDelay = 30000;
    this.abortController = null;
  }

  start() {
    this.isRunning = true;
    this.retryDelay = 1000;
    this.poll();
  }

  stop() {
    this.isRunning = false;
    if (this.abortController) {
      this.abortController.abort();
    }
  }

  async poll() {
    if (!this.isRunning) return;

    this.abortController = new AbortController();
    const timeoutId = setTimeout(() => this.abortController.abort(), 35000);

    try {
      const response = await fetch(`/api/poll/${this.userId}`, {
        signal: this.abortController.signal,
        headers: { 'Cache-Control': 'no-cache' }
      });

      clearTimeout(timeoutId);

      if (!response.ok) {
        throw new Error(`HTTP ${response.status}`);
      }

      const data = await response.json();
      
      // Reset retry delay on success
      this.retryDelay = 1000;
      
      if (data.messages && data.messages.length > 0) {
        this.onMessages(data.messages);
      }

      // Immediately reconnect
      this.poll();

    } catch (error) {
      clearTimeout(timeoutId);
      
      if (error.name === 'AbortError') {
        // Timeout or manual stop - reconnect if still running
        if (this.isRunning) {
          this.poll();
        }
        return;
      }

      console.error('Long poll error:', error.message);
      
      // Exponential backoff for errors
      await this.delay(this.retryDelay);
      this.retryDelay = Math.min(this.retryDelay * 2, this.maxRetryDelay);
      
      if (this.isRunning) {
        this.poll();
      }
    }
  }

  delay(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage
const client = new LongPollClient('user123', (messages) => {
  messages.forEach(msg => console.log('Received:', msg));
});

client.start();

The exponential backoff is essential. Without it, a server outage triggers a thundering herd of reconnection attempts that can prevent recovery. Start at 1 second, double on each failure, cap at 30 seconds.
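One refinement the client above omits: add random jitter to the backoff so recovering clients spread out instead of reconnecting in lockstep. A sketch:

```javascript
// Exponential backoff with full jitter: each client picks a random delay
// up to the current cap, spreading reconnects across the whole window.
function backoffWithJitter(attempt, baseMs = 1000, maxMs = 30000) {
  const cap = Math.min(baseMs * 2 ** attempt, maxMs);
  return Math.floor(Math.random() * cap);
}
```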

Scaling Challenges and Solutions

Long polling’s elegance hides serious scaling challenges:

Connection limits. Each waiting client holds an open connection. A single Node.js process might handle 10,000 concurrent connections, but that’s 10,000 file descriptors, 10,000 event listeners, and corresponding memory overhead.

Load balancer timeouts. AWS ALB defaults to 60 seconds. Nginx defaults to 60 seconds. If your long poll timeout exceeds these, connections get killed mid-wait.

Horizontal scaling. With multiple server instances, a user might connect to Server A while their message arrives at Server B. You need a message broker to coordinate.

Redis pub/sub solves the multi-server problem elegantly:

const Redis = require('ioredis');

const subscriber = new Redis();
const publisher = new Redis();

// Subscribe to user-specific channels
const userSubscriptions = new Map();

async function subscribeUser(userId, callback) {
  const channel = `user:${userId}:messages`;
  
  if (!userSubscriptions.has(channel)) {
    userSubscriptions.set(channel, new Set());
    await subscriber.subscribe(channel);
  }
  
  userSubscriptions.get(channel).add(callback);
  
  return () => {
    const callbacks = userSubscriptions.get(channel);
    callbacks.delete(callback);
    if (callbacks.size === 0) {
      subscriber.unsubscribe(channel);
      userSubscriptions.delete(channel);
    }
  };
}

subscriber.on('message', (channel, message) => {
  const callbacks = userSubscriptions.get(channel);
  if (callbacks) {
    const data = JSON.parse(message);
    callbacks.forEach(cb => cb(data));
  }
});

// Publishing from any server instance
async function sendMessage(targetUserId, message) {
  await publisher.publish(
    `user:${targetUserId}:messages`,
    JSON.stringify(message)
  );
}

Now any server can publish messages, and only the server holding the user’s connection responds.
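To make that concrete, here is one way to wire subscribeUser into a poll handler. makePollHandler is a hypothetical helper written against Express-style req/res; the route shape and timeout mirror the earlier examples:

```javascript
// Sketch: a long-poll handler that takes its pub/sub dependency explicitly,
// so the Redis-backed subscribeUser above (or any stand-in) can be plugged in.
function makePollHandler(subscribeUser, timeoutMs = 30000) {
  return async (req, res) => {
    let finished = false;
    let unsubscribe = () => {};

    // Exactly one outcome wins: message delivery, timeout, or disconnect
    const finish = (messages) => {
      if (finished) return;
      finished = true;
      clearTimeout(timeoutId);
      unsubscribe();
      if (messages !== null) {
        res.json({ messages, timestamp: Date.now() });
      }
    };

    const timeoutId = setTimeout(() => finish([]), timeoutMs);
    req.on('close', () => finish(null)); // disconnect: clean up, send nothing
    unsubscribe = await subscribeUser(req.params.userId, (msg) => finish([msg]));
    if (finished) unsubscribe(); // client vanished while we were subscribing
  };
}
// Usage: app.get('/api/poll/:userId', makePollHandler(subscribeUser));
```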

Long Polling vs. Alternatives

Approach              Latency                 Complexity  Browser Support   Firewall Friendly
Traditional Polling   High (interval-based)   Low         Universal         Yes
Long Polling          Low                     Medium      Universal         Yes
Server-Sent Events    Low                     Low         Modern browsers   Usually
WebSockets            Lowest                  High        Modern browsers   Sometimes

Choose traditional polling when updates are infrequent (hourly) and latency doesn’t matter.

Choose long polling when you need broad compatibility, work behind restrictive firewalls, or maintain legacy systems.

Choose SSE for server-to-client streaming where you don’t need bidirectional communication. It’s simpler than long polling and has native browser support.

Choose WebSockets for bidirectional real-time communication where latency is critical and you control the infrastructure.

Production Considerations

Monitor your long polling infrastructure aggressively:

// Health check endpoint exposing connection metrics
const connectionMetrics = {
  active: 0,
  totalServed: 0,
  errors: 0
};

app.get('/health/longpoll', (req, res) => {
  res.json({
    status: 'healthy',
    connections: {
      active: connectionMetrics.active,
      totalServed: connectionMetrics.totalServed,
      errorRate: connectionMetrics.totalServed
        ? connectionMetrics.errors / connectionMetrics.totalServed
        : 0
    },
    uptime: process.uptime(),
    memory: process.memoryUsage()
  });
});

Watch for connection count creep—it often indicates cleanup bugs. Monitor memory usage; it should correlate linearly with connection count.
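Those counters only mean something if the poll handler maintains them. One option is a wrapper, sketched here with a connectionMetrics object of the same shape as the health check's:

```javascript
// Sketch: wrap a poll handler so every held connection updates the metrics.
const connectionMetrics = { active: 0, totalServed: 0, errors: 0 };

function withMetrics(handler) {
  return async (req, res) => {
    connectionMetrics.active++;           // connection is now being held
    try {
      await handler(req, res);
      connectionMetrics.totalServed++;    // completed normally
    } catch (err) {
      connectionMetrics.errors++;         // completed with an error
      throw err;
    } finally {
      connectionMetrics.active--;         // released either way
    }
  };
}
// Usage: app.get('/api/poll/:userId', withMetrics(pollHandler));
```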

Set your timeouts conservatively. I recommend 25-30 seconds for the server timeout, 35 seconds for the client timeout, and a load balancer timeout that comfortably exceeds both.

Long polling served the web well for a decade. While WebSockets and SSE handle most modern use cases better, long polling remains a reliable fallback and a pattern worth understanding. Sometimes the simplest solution that works everywhere beats the elegant solution that works almost everywhere.
