HTTP Caching: Cache-Control, ETag, and Last-Modified
Key Insights
- Cache-Control directives determine how long browsers and CDNs store resources: max-age sets the expiration time and no-store prevents caching entirely. Choose based on content mutability.
- ETags enable efficient validation by generating content fingerprints, allowing servers to return 304 Not Modified responses when content hasn’t changed, saving bandwidth without sacrificing freshness.
- Combining Cache-Control for expiration with ETags or Last-Modified for validation creates a two-tier caching strategy that balances performance with content accuracy across different resource types.
Introduction to HTTP Caching
HTTP caching is one of the most effective performance optimizations you can implement, yet it’s frequently misconfigured or ignored entirely. Proper caching reduces server load, decreases bandwidth consumption, and dramatically improves user experience by serving resources instantly from local storage.
The performance gains are substantial. A cached resource loads in milliseconds instead of hundreds of milliseconds or seconds. For users on slow connections or mobile networks, caching can mean the difference between a usable application and an unusable one.
Three primary mechanisms control HTTP caching: Cache-Control headers define caching policies, ETags provide content-based validation, and Last-Modified headers enable time-based validation. Understanding when and how to use each is essential for building performant web applications.
Browser caches store resources locally on the user’s device, while CDN caches store resources on edge servers distributed globally. Both use the same HTTP caching headers, but their strategic purposes differ—browser caches optimize for individual users, while CDN caches optimize for geographic distribution and reduced origin server load.
Cache-Control Header Fundamentals
The Cache-Control header is your primary tool for defining caching behavior. It uses directives that tell browsers and intermediary caches how to handle a resource.
The most important directives:
- max-age=<seconds>: Specifies how long a resource remains fresh
- no-cache: Forces validation with the origin server before using cached content
- no-store: Prevents any caching whatsoever
- public: Allows shared caches (CDNs) to store the resource
- private: Restricts caching to the browser only
- immutable: Indicates the resource will never change (useful for versioned assets)
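To make the directive syntax concrete, here is a minimal sketch of a parser that turns a Cache-Control value into a plain object. The function name parseCacheControl is hypothetical and not part of any library, and the sketch assumes no directive carries a quoted value that itself contains a comma.

```javascript
// Parse a Cache-Control header value into a plain object.
// Directive names are case-insensitive; valueless directives map to true.
// Simplification: splits on every comma, so quoted values containing
// commas are not supported.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const trimmed = part.trim();
    if (trimmed === '') continue;
    const eq = trimmed.indexOf('=');
    if (eq === -1) {
      directives[trimmed.toLowerCase()] = true;
      continue;
    }
    const name = trimmed.slice(0, eq).trim().toLowerCase();
    const rawValue = trimmed.slice(eq + 1).trim().replace(/^"|"$/g, '');
    // Numeric values (such as max-age) become numbers; anything else stays a string.
    directives[name] = /^\d+$/.test(rawValue) ? Number(rawValue) : rawValue;
  }
  return directives;
}

const parsed = parseCacheControl('public, max-age=31536000, immutable');
console.log(parsed['max-age'], parsed.immutable);
```

A parsed form like this is mostly useful in tests and debugging tools, where you want to assert on individual directives rather than compare whole header strings.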
Here’s how to implement different caching strategies in Express.js:
```javascript
const path = require('path');
const express = require('express');
const app = express();

// Static assets with versioned filenames - cache aggressively
app.use('/static', (req, res, next) => {
  // Assets like app.a3f2b1.js should be cached forever.
  // req.path excludes the query string, so /app.js?v=2 still matches.
  if (/\.(js|css|png|jpg|jpeg|gif|svg|woff2)$/.test(req.path)) {
    res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
  }
  next();
}, express.static('public'));

// API responses - allow caching, but revalidate on every use
app.get('/api/user/:id', (req, res) => {
  res.setHeader('Cache-Control', 'private, no-cache');
  // Fetch and return user data
  res.json({ id: req.params.id, name: 'John Doe' });
});

// Sensitive data - never cache
app.get('/api/account/balance', (req, res) => {
  res.setHeader('Cache-Control', 'no-store');
  res.json({ balance: 1000.50 });
});

// HTML pages - short cache with validation
app.get('/', (req, res) => {
  res.setHeader('Cache-Control', 'public, max-age=300, must-revalidate');
  res.sendFile(path.join(__dirname, 'index.html'));
});
```
Use max-age for resources that change infrequently. Use no-cache when you want caching but need validation on every request. Use no-store for sensitive data that should never be cached. The immutable directive is perfect for content-hashed assets that will literally never change.
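The difference between these directives is easiest to see from the cache’s point of view. The following sketch, with a hypothetical function name and return values, shows roughly how a cache might decide what to do with a stored response. It assumes the directives have already been parsed into an object and ignores heuristic freshness, which real caches may apply when max-age is absent.

```javascript
// Decide what a cache may do with a response, given its Cache-Control
// directives (as a plain object) and the response's age in seconds.
// Simplification: without max-age this sketch always revalidates.
function decideCacheAction(directives, ageSeconds) {
  if (directives['no-store']) return 'do-not-store';
  if (directives['no-cache']) return 'revalidate';
  const maxAge = directives['max-age'];
  // A response is fresh while its age is below its freshness lifetime.
  if (typeof maxAge === 'number' && ageSeconds < maxAge) return 'serve-from-cache';
  return 'revalidate';
}

console.log(decideCacheAction({ 'max-age': 300 }, 120));
console.log(decideCacheAction({ 'no-cache': true, 'max-age': 300 }, 0));
console.log(decideCacheAction({ 'no-store': true }, 0));
```

Note that no-cache wins over max-age here: the response may be stored, but it is never served without checking with the origin first.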
ETag: Content-Based Validation
ETags (entity tags) are identifiers representing a specific version of a resource. When content changes, the ETag changes. This enables efficient validation—the client sends the ETag back to the server, and the server returns 304 Not Modified if the content hasn’t changed, avoiding a full response body.
Strong ETags (default) indicate byte-for-byte identical resources. Weak ETags (prefixed with W/) indicate semantic equivalence but allow minor differences like whitespace changes.
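As a rough illustration of the two comparison modes, the sketch below implements strong and weak matching on raw entity-tag strings. The helper names are hypothetical, and the code assumes each input is a single well-formed tag such as "abc" or W/"abc".

```javascript
// Split an entity tag into its weakness flag and opaque (quoted) value.
function parseETag(tag) {
  const weak = tag.startsWith('W/');
  return { weak, opaque: weak ? tag.slice(2) : tag };
}

// Strong comparison: both tags must be strong and character-for-character equal.
function strongMatch(a, b) {
  const x = parseETag(a);
  const y = parseETag(b);
  return !x.weak && !y.weak && x.opaque === y.opaque;
}

// Weak comparison: the opaque values must match; the W/ prefix is ignored.
function weakMatch(a, b) {
  return parseETag(a).opaque === parseETag(b).opaque;
}

console.log(strongMatch('"1"', '"1"'), weakMatch('W/"1"', '"1"'));
```

Weak comparison is what If-None-Match uses, so a weak tag is still good enough to earn a 304. Strong comparison matters for operations such as range requests, where byte-for-byte identity is required.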
The validation flow works like this: the server sends an ETag with the initial response, the browser stores it, and on subsequent requests, the browser sends an If-None-Match header containing the ETag. The server compares it with the current resource’s ETag and returns either 304 (not modified) or 200 with the new content.
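The comparison step in that flow has a few edge cases: If-None-Match may carry several tags, a weak prefix, or the wildcard *. A minimal sketch of a more tolerant check might look like this. The function name is hypothetical, and splitting on commas assumes no tag contains a comma inside its quotes.

```javascript
// Return true when If-None-Match matches the current ETag, meaning the
// server may answer 304 Not Modified instead of sending the body.
function matchesIfNoneMatch(headerValue, currentETag) {
  if (!headerValue) return false;
  // "*" matches any existing representation of the resource.
  if (headerValue.trim() === '*') return true;
  // If-None-Match uses weak comparison, so the W/ prefix is ignored.
  const stripWeak = (tag) => tag.replace(/^W\//, '');
  const current = stripWeak(currentETag);
  return headerValue
    .split(',')
    .map((tag) => stripWeak(tag.trim()))
    .some((tag) => tag === current);
}

console.log(matchesIfNoneMatch('"old", W/"abc123"', '"abc123"'));
console.log(matchesIfNoneMatch('"old"', '"abc123"'));
```

A plain string equality check covers the common single-tag case, so a helper like this is only worth adding when clients or intermediaries send lists or weak tags.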
Here’s a Node.js implementation:
```javascript
const crypto = require('crypto');
const fs = require('fs').promises;
const path = require('path');

function generateETag(content) {
  const hash = crypto
    .createHash('md5')
    .update(content)
    .digest('hex');
  // Entity tags are quoted strings on the wire
  return `"${hash}"`;
}

app.get('/api/articles/:id', async (req, res) => {
  try {
    // basename() keeps the lookup inside ./articles even if the id contains path separators
    const file = path.join('articles', `${path.basename(req.params.id)}.json`);
    const content = await fs.readFile(file, 'utf8');
    const etag = generateETag(content);

    // Send the validator and policy on both 200 and 304 responses
    res.setHeader('ETag', etag);
    res.setHeader('Cache-Control', 'public, max-age=0, must-revalidate');

    // Check if client has current version (single-tag case only)
    const clientETag = req.headers['if-none-match'];
    if (clientETag === etag) {
      // Content hasn't changed
      res.status(304).end();
      return;
    }

    // Content changed or first request
    res.json(JSON.parse(content));
  } catch (error) {
    res.status(404).json({ error: 'Article not found' });
  }
});
```
This approach saves bandwidth significantly. A 304 response contains only headers, while a 200 response includes the entire resource body. For large resources, this difference matters.
Last-Modified: Time-Based Validation
The Last-Modified header indicates when a resource was last changed. Browsers use this with the If-Modified-Since request header for validation, similar to ETags but based on timestamps rather than content hashes.
The limitation is precision—HTTP dates only have second-level granularity. If your content changes multiple times per second, Last-Modified won’t detect all changes. ETags handle this better.
However, Last-Modified is simpler to implement for file-based resources since you can use the filesystem’s modification time directly:
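```javascript
const fs = require('fs').promises;
const path = require('path');

app.get('/documents/:filename', async (req, res) => {
  // basename() keeps the lookup inside ./documents
  const filepath = path.join('documents', path.basename(req.params.filename));
  try {
    const stats = await fs.stat(filepath);
    // HTTP dates carry whole seconds, so drop the milliseconds before comparing
    const mtimeSeconds = Math.floor(stats.mtimeMs / 1000) * 1000;

    // Send the validator and policy on both 200 and 304 responses
    res.setHeader('Last-Modified', new Date(mtimeSeconds).toUTCString());
    res.setHeader('Cache-Control', 'public, max-age=3600');

    // Check if client has current version (an unparseable date yields NaN and fails the test)
    const ifModifiedSince = req.headers['if-modified-since'];
    if (ifModifiedSince && Date.parse(ifModifiedSince) >= mtimeSeconds) {
      res.status(304).end();
      return;
    }

    const content = await fs.readFile(filepath);
    res.send(content);
  } catch (error) {
    res.status(404).send('File not found');
  }
});
```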
Use Last-Modified for file-based resources where the filesystem already tracks modification times. Use ETags for dynamically generated content or when you need sub-second precision.
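The one-second granularity has a practical consequence when you compare a header against a filesystem timestamp: the file’s mtime usually carries milliseconds, while the date the client echoes back does not. The sketch below uses a hypothetical helper name and a made-up timestamp to show why the modification time should be truncated to whole seconds before comparing.

```javascript
// Compare an If-Modified-Since header against a modification time,
// truncating the modification time to whole seconds first.
function isNotModifiedSince(headerValue, mtime) {
  const since = Date.parse(headerValue);
  if (Number.isNaN(since)) return false; // unparseable header: send the full response
  const mtimeSeconds = Math.floor(mtime.getTime() / 1000) * 1000;
  return mtimeSeconds <= since;
}

// Hypothetical file modification time with a millisecond component.
const mtime = new Date('2024-10-21T07:28:00.750Z');
// What the server would send as Last-Modified, and the client would echo back.
const header = mtime.toUTCString();

console.log(header);
// Naive comparison: the header (…:00.000) is "older" than the mtime (…:00.750).
console.log(new Date(header) >= mtime);
// Truncated comparison: the two values agree.
console.log(isNotModifiedSince(header, mtime));
```

Without the truncation, an unchanged file would almost never produce a 304, because the echoed date is nearly always a fraction of a second behind the real mtime.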
Combining Caching Strategies
The most effective caching implementations combine multiple strategies. Use Cache-Control to define caching duration and validation requirements, then add ETags or Last-Modified for efficient validation.
Different resource types warrant different strategies:
- HTML pages: Short max-age (5-10 minutes) with validation headers
- CSS/JS with content hashing: Long max-age (1 year) with immutable
- CSS/JS without hashing: Medium max-age (1 hour) with ETags
- Images: Long max-age (1 month to 1 year) depending on update frequency
- API responses: no-cache or short max-age with ETags for validation
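One way to keep such a policy consistent across routes is a single lookup table. This is only a sketch: the category names and the cachePolicyFor helper are hypothetical, and the specific values (5 minutes, 1 hour, 30 days) are picked from the ranges in the list above rather than prescribed by anything.

```javascript
// Hypothetical mapping from resource category to a Cache-Control value.
const CACHE_POLICIES = {
  html: 'public, max-age=300, must-revalidate',      // 5 minutes
  hashedAsset: 'public, max-age=31536000, immutable', // 1 year
  unhashedAsset: 'public, max-age=3600',              // 1 hour, validated via ETag
  image: 'public, max-age=2592000',                   // 30 days
  api: 'no-cache',                                    // always revalidate
};

// Look up the policy for a category, falling back to the most
// conservative cacheable option for anything unknown.
function cachePolicyFor(category) {
  return Object.prototype.hasOwnProperty.call(CACHE_POLICIES, category)
    ? CACHE_POLICIES[category]
    : 'no-cache';
}

console.log(cachePolicyFor('hashedAsset'));
console.log(cachePolicyFor('somethingElse'));
```

Defaulting unknown categories to no-cache means a forgotten route costs you a revalidation round trip rather than serving stale content.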
Here’s a complete setup for a REST API with versioned static assets:
```javascript
const express = require('express');
const crypto = require('crypto');
const app = express();

// Middleware to add ETag to JSON responses.
// Express sets a weak ETag on res.json() by default; this override makes the mechanism explicit.
app.use((req, res, next) => {
  const originalJson = res.json.bind(res);
  res.json = function (data) {
    const content = JSON.stringify(data);
    const hash = crypto.createHash('md5').update(content).digest('hex');
    const etag = `"${hash}"`; // entity tags are quoted strings
    res.setHeader('ETag', etag);
    // Only answer 304 for safe methods
    const isSafe = req.method === 'GET' || req.method === 'HEAD';
    if (isSafe && req.headers['if-none-match'] === etag) {
      return res.status(304).end();
    }
    return originalJson(data);
  };
  next();
});

// Versioned static assets - cache forever
app.use('/assets', (req, res, next) => {
  res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
  next();
}, express.static('dist'));

// API endpoints - cache for 60 seconds, then revalidate
app.get('/api/products', (req, res) => {
  res.setHeader('Cache-Control', 'public, max-age=60, must-revalidate');
  res.json([
    { id: 1, name: 'Product A', price: 29.99 },
    { id: 2, name: 'Product B', price: 39.99 }
  ]);
});

// User-specific data - private cache with validation
app.get('/api/profile', (req, res) => {
  res.setHeader('Cache-Control', 'private, max-age=300, must-revalidate');
  res.json({ userId: 123, name: 'Jane Smith' });
});
```
Cache invalidation remains challenging. The best approach is versioning—change the URL when content changes. For assets, use content hashing in filenames. For APIs, consider version prefixes like /api/v2/products.
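Content hashing for filenames is normally handled by your build tool, but the idea fits in a few lines. The sketch below is an illustration only: hashedFilename is a hypothetical helper, and the choice of SHA-256 truncated to 8 hex characters is an assumption, not a convention any particular bundler follows.

```javascript
const crypto = require('crypto');
const path = require('path');

// Build a content-hashed filename such as app.<hash>.js.
// The hash changes whenever the content changes, so the URL changes too.
function hashedFilename(filename, content, hashLength = 8) {
  const hash = crypto
    .createHash('sha256')
    .update(content)
    .digest('hex')
    .slice(0, hashLength);
  const { name, ext } = path.parse(filename);
  return `${name}.${hash}${ext}`;
}

const v1 = hashedFilename('app.js', 'console.log("v1");');
const v2 = hashedFilename('app.js', 'console.log("v2");');
console.log(v1, v2, v1 === v2);
```

Because a changed file gets a new URL, the old copy can stay cached for a year with immutable and never needs to be invalidated. Only the HTML that references it has to be fresh.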
Testing and Debugging Cache Behavior
Browser DevTools make cache inspection straightforward. Open the Network tab, reload the page, and examine the Size column—cached resources show “disk cache” or “memory cache” instead of transfer sizes. The Headers tab displays all caching headers for each request.
Test cache validation by making a request, then reloading. Watch for 304 responses indicating successful validation. If you see 200 responses when expecting 304, check that ETags or Last-Modified values match.
Here’s how to inspect caching programmatically:
```javascript
// Browser console - inspect cache headers
fetch('/api/products')
  .then(response => {
    console.log('Cache-Control:', response.headers.get('Cache-Control'));
    console.log('ETag:', response.headers.get('ETag'));
    console.log('Last-Modified:', response.headers.get('Last-Modified'));
    return response.json();
  })
  .then(data => console.log(data));
```
Use curl for server-side testing:
```bash
# Initial request - get ETag
curl -I https://api.example.com/products

# Conditional request with ETag
curl -H 'If-None-Match: "abc123"' -I https://api.example.com/products

# Conditional request with Last-Modified
curl -H "If-Modified-Since: Mon, 21 Oct 2024 07:28:00 GMT" -I https://api.example.com/products
```
Common pitfalls include forgetting to set Vary headers when content varies by request headers, using no-cache when you mean no-store, and setting max-age too high without validation headers, making updates difficult to deploy.
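The Vary pitfall is easier to reason about once you see how a cache might build its lookup key. In this sketch the cacheKey function and its key format are hypothetical. It assumes request header names are already lowercased (as in Node’s req.headers) and treats Vary: * as uncacheable.

```javascript
// Build a cache key from the URL plus the request headers named in Vary.
// Returns null when Vary is "*", since no stored response can be reused.
function cacheKey(url, requestHeaders, varyHeader) {
  const names = (varyHeader || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name !== '')
    .sort();
  if (names.includes('*')) return null;
  const parts = names.map((name) => `${name}=${requestHeaders[name] ?? ''}`);
  return [url, ...parts].join('|');
}

const gzipKey = cacheKey('/api/products', { 'accept-encoding': 'gzip' }, 'Accept-Encoding');
const brKey = cacheKey('/api/products', { 'accept-encoding': 'br' }, 'Accept-Encoding');
console.log(gzipKey, brKey);
```

If the server omits Vary, both requests collapse onto the same key, and a client that asked for one variant can be served the other.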
Conclusion
HTTP caching is a force multiplier for web performance. Cache-Control defines your caching policy, ETags provide precise content-based validation, and Last-Modified offers simpler time-based validation. Use them together for maximum effectiveness.
Quick reference for implementation:
| Resource Type | Cache-Control | Validation | Rationale |
|---|---|---|---|
| Versioned assets | max-age=31536000, immutable | None needed | Content never changes |
| HTML pages | max-age=300, must-revalidate | ETag | Balance freshness and performance |
| API responses | private, max-age=60 | ETag | Short cache with validation |
| User data | private, no-cache | ETag | Always validate, but cache |
| Sensitive data | no-store | None | Never cache |
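The performance impact is measurable and often significant. How much server load caching removes depends on your traffic and cache hit rates; rough figures of 60-90% for static assets and 30-50% for API responses are plausible, but confirm them against your own metrics. Users experience faster page loads, lower data usage, and better offline resilience.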
Start with conservative caching policies and increase cache duration as you gain confidence. Monitor cache hit rates and adjust based on your application’s specific patterns. The investment in proper caching configuration pays dividends in performance, scalability, and user satisfaction.