Immutable Infrastructure: Replace Not Repair
Key Insights
- Immutable infrastructure eliminates configuration drift by treating servers as disposable units that are replaced entirely rather than patched in place, reducing the “works on my machine” problem to “works with this image”
- The shift requires rethinking your entire deployment pipeline—from image building with Packer or Docker to blue-green deployments—but pays dividends in reproducibility and rollback speed
- You’ll trade SSH access for centralized logging and observability, which feels constraining at first but forces better architectural practices and actually improves debugging at scale
The Traditional vs. Immutable Paradigm
Traditional infrastructure management is like maintaining a classic car. You patch the OS, tweak configuration files, install dependencies, and hope nothing breaks. Over months, your production server diverges from staging. You’ve got snowflake servers that nobody dares touch because “it just works” and recreating it would be a nightmare.
Here’s what that looks like in practice:
# Traditional mutable approach
ssh prod-server-01
sudo apt-get update
sudo apt-get upgrade nginx
sudo vim /etc/nginx/nginx.conf
sudo systemctl restart nginx
# Pray it works, repeat for 12 more servers
Immutable infrastructure flips this model. Servers become cattle, not pets. When you need to update nginx, you don’t patch the existing server—you build a new image with the updated nginx, deploy fresh instances, and terminate the old ones.
# Immutable approach
packer build nginx-v2.json
terraform apply -var="ami_id=ami-new123"
# All servers replaced with identical new instances
The difference is philosophical. In the mutable world, your infrastructure’s current state is the sum of every change ever made. In the immutable world, your infrastructure’s state is explicitly defined in version-controlled code.
Core Principles of Immutable Infrastructure
Immutable infrastructure rests on four pillars:
Never modify deployed infrastructure. Once a server boots, it runs unchanged until termination. No SSH sessions to tweak configs. No emergency patches. If something needs changing, you build a new image and deploy it.
All changes flow through version control. Your infrastructure definitions live in Git alongside application code. Want to know why production differs from staging? Check the commit history.
Automated provisioning is mandatory. Manual steps don’t scale and break immutability. Everything from image building to deployment must be automated.
Versioning is first-class. Every artifact—AMIs, Docker images, Helm charts—gets a version tag. Rollbacks become terraform apply -var="app_version=v1.2.3".
Here’s how this looks with Terraform and versioned AMIs:
# infrastructure/main.tf
variable "app_version" {
  description = "Application version to deploy"
  type        = string
  default     = "v1.3.0"
}

variable "subnet_ids" {
  description = "Subnets for the Auto Scaling group"
  type        = list(string)
}

data "aws_ami" "app" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["myapp-${var.app_version}"]
  }

  filter {
    name   = "tag:Immutable"
    values = ["true"]
  }
}

resource "aws_launch_template" "app" {
  name_prefix   = "myapp-"
  image_id      = data.aws_ami.app.id
  instance_type = "t3.medium"

  tag_specifications {
    resource_type = "instance"
    tags = {
      Version   = var.app_version
      Immutable = "true"
    }
  }
}

resource "aws_autoscaling_group" "app" {
  desired_capacity    = 3
  max_size            = 5
  min_size            = 2
  vpc_zone_identifier = var.subnet_ids # required: where instances launch

  launch_template {
    id      = aws_launch_template.app.id
    version = "$Latest"
  }
}
Deploying a new version is just terraform apply -var="app_version=v1.4.0". Terraform handles the orchestration.
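That one-variable workflow can be wrapped in a small deploy script. This is a hedged sketch: the Packer template name (myapp.json) is an assumption, and each step is echoed rather than executed so the sketch runs without Packer or Terraform installed—remove the echo wrapping to run it for real.

```shell
#!/bin/sh
# Minimal deploy wrapper: bake an image for the given version, then
# point Terraform at it. Steps are echoed instead of executed.
set -eu

deploy() {
  version="$1"
  echo "+ packer build -var app_version=$version myapp.json"
  echo "+ terraform apply -auto-approve -var app_version=$version"
}

deploy "v1.4.0"
```

A single version string drives both tools, which keeps the image and the infrastructure definition in lockstep.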
Building Immutable Images
The foundation of immutable infrastructure is the image. You have two main approaches: baking (pre-installing everything) and bootstrapping (minimal image + runtime configuration). Baking is slower to build but faster to deploy. For immutable infrastructure, baking wins—it’s more deterministic.
Here’s a Packer template that bakes an application into an AMI:
{
  "variables": {
    "app_version": "v1.3.0"
  },
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami_filter": {
      "filters": {
        "name": "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
      },
      "most_recent": true,
      "owners": ["099720109477"]
    },
    "instance_type": "t3.medium",
    "ssh_username": "ubuntu",
    "ami_name": "myapp-{{user `app_version`}}",
    "tags": {
      "Version": "{{user `app_version`}}",
      "Immutable": "true"
    }
  }],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "sudo apt-get update",
        "sudo apt-get install -y docker.io",
        "sudo systemctl enable docker"
      ]
    },
    {
      "type": "file",
      "source": "app/",
      "destination": "/tmp/app"
    },
    {
      "type": "shell",
      "inline": [
        "cd /tmp/app",
        "sudo docker build -t myapp:{{user `app_version`}} .",
        "sudo docker save myapp:{{user `app_version`}} -o /opt/myapp.tar"
      ]
    }
  ]
}
For containerized workloads, multi-stage Docker builds create lean, immutable images:
# Build stage
FROM node:18-alpine AS builder
WORKDIR /build
COPY package*.json ./
# Full install: devDependencies are needed for the build step
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only runtime deps reach the production stage
RUN npm prune --omit=dev
# Production stage
FROM node:18-alpine
WORKDIR /app
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001
COPY --from=builder --chown=nodejs:nodejs /build/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /build/node_modules ./node_modules
USER nodejs
EXPOSE 3000
CMD ["node", "dist/index.js"]
Tag images with semantic versions and Git SHAs: myapp:v1.3.0-abc123def. This creates an audit trail from production back to source code.
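That tagging scheme is a one-liner in shell. A quick sketch—the short SHA is hard-coded here so it runs outside a Git checkout; normally it would come from `git rev-parse --short HEAD`:

```shell
#!/bin/sh
# Compose an image tag of the form myapp:<semver>-<git sha>.
VERSION="v1.3.0"
GIT_SHA="abc123def"   # stand-in for: $(git rev-parse --short HEAD)
TAG="myapp:${VERSION}-${GIT_SHA}"
echo "$TAG"
```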
Deployment Strategies
Immutable infrastructure enables sophisticated deployment patterns. Blue-green deployments are the simplest: run two identical environments, switch traffic from blue to green, keep blue warm for instant rollback.
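In Kubernetes terms, the blue-green switch can be as small as a Service selector. A sketch with illustrative names: pointing `version` at green is the cutover, and pointing it back at blue is the instant rollback.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue   # flip to "green" to cut over; flip back to roll back
  ports:
    - port: 80
      targetPort: 3000
```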
Kubernetes makes rolling updates natural:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2        # never more than 8 pods total
      maxUnavailable: 1  # never fewer than 5 pods running
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: v1.4.0
    spec:
      containers:
        - name: myapp
          image: myregistry.io/myapp:v1.4.0
          ports:
            - containerPort: 3000
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 3
When you update the image tag, Kubernetes gradually replaces pods. The readiness probe ensures new pods are healthy before receiving traffic. If something fails, kubectl rollout undo deployment/myapp reverts instantly.
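The bounds in the manifest's comments follow directly from the strategy fields; as quick arithmetic:

```shell
#!/bin/sh
# Rolling-update bounds for replicas=6, maxSurge=2, maxUnavailable=1.
REPLICAS=6
MAX_SURGE=2
MAX_UNAVAILABLE=1
MAX_PODS=$((REPLICAS + MAX_SURGE))        # upper bound on total pods
MIN_READY=$((REPLICAS - MAX_UNAVAILABLE)) # lower bound on ready pods
echo "pods stay between $MIN_READY ready and $MAX_PODS total during a rollout"
```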
Handling State and Configuration
The hardest part of immutable infrastructure is state. Databases, uploaded files, and user sessions don’t fit the “replace everything” model.
The solution is separation. Stateful components live outside your immutable infrastructure:
# docker-compose.yml
version: '3.8'
services:
  app:
    image: myapp:${APP_VERSION}
    environment:
      - DATABASE_URL=postgresql://db:5432/myapp
      - REDIS_URL=redis://cache:6379
    depends_on:
      - db
      - cache

  db:
    image: postgres:15
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}  # required by the postgres image

  cache:
    image: redis:7-alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:
Configuration comes from environment variables or external systems. In Kubernetes:
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
data:
  LOG_LEVEL: "info"
  FEATURE_FLAGS: "new_ui,beta_api"
---
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
type: Opaque
data:
  DATABASE_PASSWORD: cGFzc3dvcmQxMjM=  # base64 encoded
  API_KEY: YWJjZGVmZ2hpams=
---
# Deployment fragment: only the env wiring is shown
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: myapp
          envFrom:
            - configMapRef:
                name: myapp-config
            - secretRef:
                name: myapp-secrets
Database migrations run as init containers or separate jobs, never baked into images.
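A sketch of the separate-job variant, reusing the versioned app image and the secrets defined above; the dist/migrate.js entrypoint is hypothetical, not from the examples above:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-migrate-v1-4-0
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: myregistry.io/myapp:v1.4.0
          command: ["node", "dist/migrate.js"]  # hypothetical migration entrypoint
          envFrom:
            - secretRef:
                name: myapp-secrets
```

Running migrations from the same versioned image keeps schema changes tied to the release that needs them, without baking migration state into the server image.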
Observability and Debugging
Without SSH access, observability becomes critical. You need centralized logging, metrics, and tracing.
Structured logging is non-negotiable:
// app/logger.js
const winston = require('winston');

const logger = winston.createLogger({
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  defaultMeta: {
    service: 'myapp',
    version: process.env.APP_VERSION,
    instance: process.env.HOSTNAME
  },
  transports: [
    new winston.transports.Console()
  ]
});

module.exports = logger;

// Usage
logger.info('Request processed', {
  requestId: req.id,
  userId: req.user.id,
  duration: elapsed,
  statusCode: res.statusCode
});
Logs flow to centralized systems like Loki or Elasticsearch. You debug by querying logs, not by SSH-ing into servers.
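With Loki, for instance, the JSON fields the logger emits become filterable. A hedged LogQL sketch—the app label name depends on your scrape configuration:

```logql
{app="myapp"} | json | version="v1.4.0" | statusCode >= 500
```

The json stage parses the structured log line, so any field the logger attached—version, instance, statusCode—becomes a query filter, which is the immutable-world replacement for grepping logs over SSH.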
Real-World Tradeoffs and When to Use
Immutable infrastructure isn’t free. Building images takes time. Storage costs multiply when you’re versioning AMIs. Small teams might find the overhead excessive.
Use immutable infrastructure when:
- You’re running distributed systems at scale
- Compliance requires audit trails
- You need reliable rollbacks
- Configuration drift has bitten you
Skip it when:
- You’re a three-person startup
- Your infrastructure is a single VPS
- Deployment frequency is monthly
The sweet spot is mid-sized teams running containerized applications on Kubernetes or serverless platforms where immutability is built-in.
Immutable infrastructure forces discipline. You can’t cowboy your way out of problems with quick SSH fixes. But that discipline creates systems that are predictable, reproducible, and resilient. The cattle-not-pets philosophy scales in ways that artisanal server maintenance never could.