HTTP Caching: Cache-Control, ETag, and Last-Modified

Key Insights

  • Cache-Control directives determine how long browsers and CDNs store resources, with max-age setting expiration time and no-store preventing caching entirely—choose based on content mutability.
  • ETags enable efficient validation by generating content fingerprints, allowing servers to return 304 Not Modified responses when content hasn’t changed, saving bandwidth without sacrificing freshness.
  • Combining Cache-Control for expiration with ETags or Last-Modified for validation creates a two-tier caching strategy that balances performance with content accuracy across different resource types.

Introduction to HTTP Caching

HTTP caching is one of the most effective performance optimizations you can implement, yet it’s frequently misconfigured or ignored entirely. Proper caching reduces server load, decreases bandwidth consumption, and dramatically improves user experience by serving resources instantly from local storage.

The performance gains are substantial. A cached resource loads in milliseconds instead of hundreds of milliseconds or seconds. For users on slow connections or mobile networks, caching can mean the difference between a usable application and an unusable one.

Three primary mechanisms control HTTP caching: Cache-Control headers define caching policies, ETags provide content-based validation, and Last-Modified headers enable time-based validation. Understanding when and how to use each is essential for building performant web applications.

Browser caches store resources locally on the user’s device, while CDN caches store resources on edge servers distributed globally. Both use the same HTTP caching headers, but their strategic purposes differ—browser caches optimize for individual users, while CDN caches optimize for geographic distribution and reduced origin server load.

Cache-Control Header Fundamentals

The Cache-Control header is your primary tool for defining caching behavior. It uses directives that tell browsers and intermediary caches how to handle a resource.

The most important directives:

  • max-age=<seconds>: Specifies how long a resource remains fresh
  • no-cache: Forces validation with the origin server before using cached content
  • no-store: Prevents any caching whatsoever
  • public: Allows shared caches (CDNs) to store the resource
  • private: Restricts caching to the browser only
  • immutable: Indicates the resource will never change (useful for versioned assets)
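To see how these directives fit together in a single header value, here is a minimal sketch of a parser that turns a Cache-Control string into an object. The function name parseCacheControl is hypothetical, and the sketch assumes directive arguments never contain commas, so treat it as an illustration rather than a spec-complete parser.

```javascript
// Parse a Cache-Control header value into a plain object.
// Sketch only: quoted arguments that contain commas are not supported.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of String(headerValue || '').split(',')) {
    const trimmed = part.trim();
    if (!trimmed) continue;
    const eq = trimmed.indexOf('=');
    // Directive names are case-insensitive, so normalize to lowercase.
    const name = (eq === -1 ? trimmed : trimmed.slice(0, eq)).trim().toLowerCase();
    if (eq === -1) {
      directives[name] = true; // flag directive such as no-store or immutable
    } else {
      const raw = trimmed.slice(eq + 1).trim().replace(/^"|"$/g, '');
      directives[name] = /^\d+$/.test(raw) ? Number(raw) : raw;
    }
  }
  return directives;
}

const policy = parseCacheControl('public, max-age=31536000, immutable');
console.log(policy['max-age']); // 31536000
console.log(policy.immutable);  // true
```

A parsed object like this makes it easy to assert on caching policy in tests instead of comparing raw header strings.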

Here’s how to implement different caching strategies in Express.js:

const express = require('express');
const app = express();

// Static assets with versioned filenames - cache aggressively
app.use('/static', (req, res, next) => {
  // Assets like app.a3f2b1.js should be cached forever.
  // req.path excludes the query string, so app.a3f2b1.js?v=2 still matches.
  if (/\.(js|css|png|jpg|jpeg|gif|svg|woff2)$/.test(req.path)) {
    res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
  }
  next();
}, express.static('public'));

// API responses - cache privately, but revalidate before every reuse
app.get('/api/user/:id', (req, res) => {
  res.setHeader('Cache-Control', 'private, no-cache');
  // Fetch and return user data
  res.json({ id: req.params.id, name: 'John Doe' });
});

// Sensitive data - never cache
app.get('/api/account/balance', (req, res) => {
  res.setHeader('Cache-Control', 'no-store');
  res.json({ balance: 1000.50 });
});

// HTML pages - short cache with validation
app.get('/', (req, res) => {
  res.setHeader('Cache-Control', 'public, max-age=300, must-revalidate');
  res.sendFile(__dirname + '/index.html');
});

Use max-age for resources that change infrequently. Use no-cache when you want caching but need validation on every request. Use no-store for sensitive data that should never be cached. The immutable directive is perfect for content-hashed assets that will literally never change.
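The difference between these directives is easiest to see from the cache's point of view. The sketch below shows, under simplifying assumptions, how a cache might decide whether a stored response can be reused without contacting the server. The function name canReuseWithoutValidation is hypothetical, and real caches also account for the Age header, heuristic freshness, and request directives, all of which are ignored here.

```javascript
// Decide whether a stored response may be reused without revalidation.
// storedAtMs and nowMs are epoch milliseconds; cacheControl is the raw header.
function canReuseWithoutValidation(cacheControl, storedAtMs, nowMs) {
  const value = String(cacheControl || '').toLowerCase();
  const names = value.split(',').map((d) => d.trim().split('=')[0]);
  // no-store should never have been stored; no-cache always needs validation.
  if (names.includes('no-store') || names.includes('no-cache')) return false;
  const match = /(?:^|,)\s*max-age=(\d+)/.exec(value);
  if (!match) return false; // no explicit lifetime: treated as stale in this sketch
  const ageSeconds = Math.max(0, (nowMs - storedAtMs) / 1000);
  return ageSeconds < Number(match[1]);
}

const storedAt = Date.UTC(2024, 0, 1, 12, 0, 0);
console.log(canReuseWithoutValidation('public, max-age=300', storedAt, storedAt + 60 * 1000));  // true
console.log(canReuseWithoutValidation('public, max-age=300', storedAt, storedAt + 301 * 1000)); // false
console.log(canReuseWithoutValidation('private, no-cache', storedAt, storedAt + 1000));         // false
```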

ETag: Content-Based Validation

ETags (entity tags) are identifiers representing a specific version of a resource. When content changes, the ETag changes. This enables efficient validation—the client sends the ETag back to the server, and the server returns 304 Not Modified if the content hasn’t changed, avoiding a full response body.

Strong ETags (default) indicate byte-for-byte identical resources. Weak ETags (prefixed with W/) indicate semantic equivalence but allow minor differences like whitespace changes.
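The distinction matters when two tags are compared. As a rough sketch, strong comparison requires both tags to be strong and identical, while weak comparison ignores the W/ prefix. The helper names parseETag, strongEquals, and weakEquals are hypothetical and chosen only for this illustration.

```javascript
// Split an entity tag into its weakness flag and its quoted opaque value.
function parseETag(tag) {
  const weak = tag.startsWith('W/');
  return { weak, opaque: weak ? tag.slice(2) : tag };
}

// Strong comparison: both tags must be strong and character-for-character equal.
function strongEquals(a, b) {
  const x = parseETag(a);
  const y = parseETag(b);
  return !x.weak && !y.weak && x.opaque === y.opaque;
}

// Weak comparison: the opaque values must be equal; the W/ prefix is ignored.
function weakEquals(a, b) {
  return parseETag(a).opaque === parseETag(b).opaque;
}

console.log(strongEquals('"v1"', '"v1"'));   // true
console.log(strongEquals('W/"v1"', '"v1"')); // false
console.log(weakEquals('W/"v1"', '"v1"'));   // true
console.log(weakEquals('W/"v1"', 'W/"v2"')); // false
```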

The validation flow works like this: the server sends an ETag with the initial response, the browser stores it, and on subsequent requests, the browser sends an If-None-Match header containing the ETag. The server compares it with the current resource’s ETag and returns either 304 (not modified) or 200 with the new content.

Here’s a Node.js implementation:

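const crypto = require('crypto');
const fs = require('fs').promises;

// ETag values must be quoted strings, e.g. "5d41402abc4b2a76b9719d911017c592"
function generateETag(content) {
  const hash = crypto
    .createHash('md5') // used as a content fingerprint here, not for security
    .update(content)
    .digest('hex');
  return `"${hash}"`;
}

app.get('/api/articles/:id', async (req, res) => {
  // Accept only simple IDs so the parameter cannot escape ./articles
  if (!/^[\w-]+$/.test(req.params.id)) {
    res.status(404).json({ error: 'Article not found' });
    return;
  }

  try {
    const content = await fs.readFile(`./articles/${req.params.id}.json`, 'utf8');
    const etag = generateETag(content);

    // Set these first: a 304 should carry the same validator and policy as a 200
    res.setHeader('ETag', etag);
    res.setHeader('Cache-Control', 'public, max-age=0, must-revalidate');

    // Check if client has current version.
    // Exact match only: a comma-separated list or "*" is not handled here.
    if (req.headers['if-none-match'] === etag) {
      // Content hasn't changed
      res.status(304).end();
      return;
    }

    // Content changed or first request
    res.json(JSON.parse(content));
  } catch (error) {
    res.status(404).json({ error: 'Article not found' });
  }
});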

This approach saves bandwidth significantly. A 304 response contains only headers, while a 200 response includes the entire resource body. For large resources, this difference matters.
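The handler above compares If-None-Match against a single tag, but clients may send a comma-separated list of tags or the wildcard "*". A slightly more tolerant check could look like the sketch below. The function name matchesIfNoneMatch is hypothetical, and the sketch assumes tag values contain no commas, which holds for hex hashes.

```javascript
// Return true when the current ETag satisfies an If-None-Match header,
// meaning the server can answer 304 Not Modified.
function matchesIfNoneMatch(headerValue, currentETag) {
  if (!headerValue) return false;
  if (headerValue.trim() === '*') return true; // "*" matches any current version
  // If-None-Match uses weak comparison, so the W/ prefix is ignored.
  const stripWeak = (tag) => tag.trim().replace(/^W\//, '');
  const current = stripWeak(currentETag);
  return headerValue.split(',').some((tag) => stripWeak(tag) === current);
}

console.log(matchesIfNoneMatch('"abc", "def"', '"def"')); // true
console.log(matchesIfNoneMatch('W/"abc"', '"abc"'));      // true
console.log(matchesIfNoneMatch('"abc"', '"xyz"'));        // false
```

In a handler, this would replace the strict equality check: respond with 304 when the function returns true, and with 200 plus the body otherwise.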

Last-Modified: Time-Based Validation

The Last-Modified header indicates when a resource was last changed. Browsers use this with the If-Modified-Since request header for validation, similar to ETags but based on timestamps rather than content hashes.

The limitation is precision—HTTP dates only have second-level granularity. If your content changes multiple times per second, Last-Modified won’t detect all changes. ETags handle this better.
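A small sketch makes the granularity problem concrete: two edits within the same second produce the same HTTP date, so a client holding the first version cannot tell that the second exists. The helper name toHttpDate and the timestamps are made up for the example.

```javascript
// Format an epoch-millisecond timestamp as an HTTP date (whole seconds only).
function toHttpDate(ms) {
  return new Date(ms).toUTCString();
}

const firstEdit = Date.UTC(2024, 9, 21, 7, 28, 0, 100);  // 07:28:00.100 UTC
const secondEdit = Date.UTC(2024, 9, 21, 7, 28, 0, 900); // 07:28:00.900 UTC

console.log(toHttpDate(firstEdit));                            // Mon, 21 Oct 2024 07:28:00 GMT
console.log(toHttpDate(firstEdit) === toHttpDate(secondEdit)); // true
```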

However, Last-Modified is simpler to implement for file-based resources since you can use the filesystem’s modification time directly:

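const fs = require('fs').promises;
const path = require('path');

app.get('/documents/:filename', async (req, res) => {
  // basename() drops directory parts so the parameter cannot escape ./documents
  const filepath = path.join('./documents', path.basename(req.params.filename));

  try {
    const stats = await fs.stat(filepath);
    if (!stats.isFile()) throw new Error('Not a regular file');

    // HTTP dates have whole-second precision, so drop the milliseconds
    // before comparing; otherwise the 304 branch would almost never run
    const mtimeSeconds = Math.floor(stats.mtimeMs / 1000);
    const lastModified = new Date(mtimeSeconds * 1000).toUTCString();

    // Set these first so a 304 carries them too
    res.setHeader('Last-Modified', lastModified);
    res.setHeader('Cache-Control', 'public, max-age=3600');

    // Check if client has current version. A missing or unparseable
    // date yields NaN, which falls through to a full response.
    const ifModifiedSince = Date.parse(req.headers['if-modified-since']);

    if (Math.floor(ifModifiedSince / 1000) >= mtimeSeconds) {
      res.status(304).end();
      return;
    }

    const content = await fs.readFile(filepath);
    res.send(content);
  } catch (error) {
    res.status(404).send('File not found');
  }
});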

Use Last-Modified for file-based resources where the filesystem already tracks modification times. Use ETags for dynamically generated content or when you need sub-second precision.

Combining Caching Strategies

The most effective caching implementations combine multiple strategies. Use Cache-Control to define caching duration and validation requirements, then add ETags or Last-Modified for efficient validation.

Different resource types warrant different strategies:

  • HTML pages: Short max-age (5-10 minutes) with validation headers
  • CSS/JS with content hashing: Long max-age (1 year) with immutable
  • CSS/JS without hashing: Medium max-age (1 hour) with ETags
  • Images: Long max-age (1 month to 1 year) depending on update frequency
  • API responses: no-cache or short max-age with ETags for validation
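One way to keep these rules in a single place is a small lookup function that maps a request path to a Cache-Control value. The sketch below is only an illustration of the list above: the function name cacheControlFor, the filename patterns, and the exact durations are assumptions you would adapt to your own asset layout.

```javascript
// Map a request path to a Cache-Control value following the list above.
// Patterns and durations are illustrative assumptions, not fixed rules.
const HASHED_ASSET = /\.[0-9a-f]{6,}\.(js|css)$/i; // e.g. app.a3f2b1.js
const PLAIN_ASSET = /\.(js|css)$/i;
const IMAGE = /\.(png|jpe?g|gif|svg|webp)$/i;

function cacheControlFor(path) {
  if (path.startsWith('/api/')) return 'no-cache';
  if (HASHED_ASSET.test(path)) return 'public, max-age=31536000, immutable'; // 1 year
  if (PLAIN_ASSET.test(path)) return 'public, max-age=3600';                 // 1 hour
  if (IMAGE.test(path)) return 'public, max-age=2592000';                    // 30 days
  return 'public, max-age=300, must-revalidate'; // HTML and everything else
}

console.log(cacheControlFor('/static/app.a3f2b1.js')); // public, max-age=31536000, immutable
console.log(cacheControlFor('/static/app.js'));        // public, max-age=3600
console.log(cacheControlFor('/api/products'));         // no-cache
```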

Here’s a complete setup for a REST API with versioned static assets:

const express = require('express');
const crypto = require('crypto');
const app = express();

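// Middleware to add ETag to JSON responses
// (Express also generates weak ETags for res.json by default; this version
// makes the mechanism explicit)
app.use((req, res, next) => {
  const originalJson = res.json.bind(res);
  res.json = function(data) {
    const content = JSON.stringify(data);
    const hash = crypto.createHash('md5').update(content).digest('hex');
    const etag = `"${hash}"`; // ETag values must be quoted strings

    // Set the validator first so a 304 carries it too
    res.setHeader('ETag', etag);

    if (req.headers['if-none-match'] === etag) {
      return res.status(304).end();
    }

    return originalJson(data);
  };
  next();
});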

// Versioned static assets - cache forever
app.use('/assets', (req, res, next) => {
  res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
  next();
}, express.static('dist'));

// API endpoints - cache for 60 seconds, then revalidate once stale
app.get('/api/products', (req, res) => {
  res.setHeader('Cache-Control', 'public, max-age=60, must-revalidate');
  res.json([
    { id: 1, name: 'Product A', price: 29.99 },
    { id: 2, name: 'Product B', price: 39.99 }
  ]);
});

// User-specific data - private cache with validation
app.get('/api/profile', (req, res) => {
  res.setHeader('Cache-Control', 'private, max-age=300, must-revalidate');
  res.json({ userId: 123, name: 'Jane Smith' });
});

Cache invalidation remains challenging. The best approach is versioning—change the URL when content changes. For assets, use content hashing in filenames. For APIs, consider version prefixes like /api/v2/products.
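Build tools normally handle content hashing for you, but the idea is simple enough to sketch by hand. The helper below is a hypothetical illustration, not how any particular bundler does it: it derives a short SHA-256 fingerprint from the file contents and inserts it before the extension, so any change to the content yields a new URL.

```javascript
const crypto = require('crypto');
const path = require('path');

// Build a filename of the form name.<8 hex chars>.ext from the file's contents.
function hashedFilename(filename, content) {
  const hash = crypto.createHash('sha256').update(content).digest('hex').slice(0, 8);
  const { name, ext } = path.parse(filename);
  return `${name}.${hash}${ext}`;
}

const before = hashedFilename('app.js', 'console.log("v1");');
const after = hashedFilename('app.js', 'console.log("v2");');
console.log(before === after); // false: changed content produces a new filename
```

Because the old URL is never reused for new content, the one-year immutable policy stays safe.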

Testing and Debugging Cache Behavior

Browser DevTools make cache inspection straightforward. Open the Network tab, reload the page, and examine the Size column—cached resources show “disk cache” or “memory cache” instead of transfer sizes. The Headers tab displays all caching headers for each request.

Test cache validation by making a request, then reloading. Watch for 304 responses indicating successful validation. If you see 200 responses when expecting 304, check that ETags or Last-Modified values match.

Here’s how to inspect caching programmatically:

// Browser console - inspect cache headers
fetch('/api/products')
  .then(response => {
    console.log('Cache-Control:', response.headers.get('Cache-Control'));
    console.log('ETag:', response.headers.get('ETag'));
    console.log('Last-Modified:', response.headers.get('Last-Modified'));
    return response.json();
  })
  .then(data => console.log(data));

Use curl for server-side testing:

# Initial request - get ETag
curl -I https://api.example.com/products

# Conditional request with ETag
curl -H "If-None-Match: \"abc123\"" -I https://api.example.com/products

# Conditional request with Last-Modified
curl -H "If-Modified-Since: Mon, 21 Oct 2024 07:28:00 GMT" -I https://api.example.com/products

Common pitfalls include forgetting to set Vary headers when content varies by request headers, using no-cache when you mean no-store, and setting max-age too high without validation headers, making updates difficult to deploy.
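The Vary pitfall is easier to understand once you picture how a cache builds its lookup key. The following sketch is a simplified model, not the algorithm of any specific browser or CDN: the function name cacheKey is hypothetical, and it assumes request headers are keyed by lowercase names, as Node.js provides them.

```javascript
// Build a cache key from the URL plus the request headers named in Vary.
// requestHeaders is assumed to use lowercase header names.
function cacheKey(url, varyHeader, requestHeaders) {
  const varyNames = String(varyHeader || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean)
    .sort();
  const parts = varyNames.map((name) => `${name}=${requestHeaders[name] || ''}`);
  return [url, ...parts].join('|');
}

const gzipKey = cacheKey('/api/products', 'Accept-Encoding', { 'accept-encoding': 'gzip' });
const brKey = cacheKey('/api/products', 'Accept-Encoding', { 'accept-encoding': 'br' });
console.log(gzipKey);           // /api/products|accept-encoding=gzip
console.log(gzipKey === brKey); // false: each encoding gets its own entry

// Without Vary, both requests share one entry, so a client could
// receive a response encoded for someone else.
const noVaryGzip = cacheKey('/api/products', '', { 'accept-encoding': 'gzip' });
const noVaryBr = cacheKey('/api/products', '', { 'accept-encoding': 'br' });
console.log(noVaryGzip === noVaryBr); // true
```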

Conclusion

HTTP caching is a force multiplier for web performance. Cache-Control defines your caching policy, ETags provide precise content-based validation, and Last-Modified offers simpler time-based validation. Use them together for maximum effectiveness.

Quick reference for implementation:

Resource Type    | Cache-Control                | Validation  | Rationale
Versioned assets | max-age=31536000, immutable  | None needed | Content never changes
HTML pages       | max-age=300, must-revalidate | ETag        | Balance freshness and performance
API responses    | private, max-age=60          | ETag        | Short cache with validation
User data        | private, no-cache            | ETag        | Always validate, but cache
Sensitive data   | no-store                     | None        | Never cache

The performance impact is measurable and significant. Proper caching can reduce server load by as much as 60-90% for static assets and 30-50% for API responses, though the actual savings depend on your traffic patterns and cache hit rates. Users experience faster page loads, lower data usage, and better offline resilience.

Start with conservative caching policies and increase cache duration as you gain confidence. Monitor cache hit rates and adjust based on your application’s specific patterns. The investment in proper caching configuration pays dividends in performance, scalability, and user satisfaction.
