Immutable Infrastructure: Replace Not Repair
Key Insights
- Immutable infrastructure eliminates configuration drift by treating servers as disposable units that are replaced entirely rather than patched in place, reducing the “works on my machine” problem to “works with this image”
- The shift requires rethinking your entire deployment pipeline—from image building with Packer or Docker to blue-green deployments—but pays dividends in reproducibility and rollback speed
- You’ll trade SSH access for centralized logging and observability, which feels constraining at first but forces better architectural practices and actually improves debugging at scale
The Traditional vs. Immutable Paradigm
Traditional infrastructure management is like maintaining a classic car. You patch the OS, tweak configuration files, install dependencies, and hope nothing breaks. Over months, your production server diverges from staging. You’ve got snowflake servers that nobody dares touch because “it just works” and recreating it would be a nightmare.
Here’s what that looks like in practice:
# Traditional mutable approach
ssh prod-server-01
sudo apt-get update
sudo apt-get upgrade nginx
sudo vim /etc/nginx/nginx.conf
sudo systemctl restart nginx
# Pray it works, repeat for 12 more servers
Immutable infrastructure flips this model. Servers become cattle, not pets. When you need to update nginx, you don’t patch the existing server—you build a new image with the updated nginx, deploy fresh instances, and terminate the old ones.
# Immutable approach
packer build nginx-v2.json
terraform apply -var="ami_id=ami-new123"
# All servers replaced with identical new instances
The difference is philosophical. In the mutable world, your infrastructure’s current state is the sum of every change ever made. In the immutable world, your infrastructure’s state is explicitly defined in version-controlled code.
Core Principles of Immutable Infrastructure
Immutable infrastructure rests on four pillars:
Never modify deployed infrastructure. Once a server boots, it runs unchanged until termination. No SSH sessions to tweak configs. No emergency patches. If something needs changing, you build a new image and deploy it.
All changes flow through version control. Your infrastructure definitions live in Git alongside application code. Want to know why production differs from staging? Check the commit history.
Automated provisioning is mandatory. Manual steps don’t scale and break immutability. Everything from image building to deployment must be automated.
Versioning is first-class. Every artifact—AMIs, Docker images, Helm charts—gets a version tag. Rollbacks become terraform apply -var="app_version=v1.2.3".
Here’s how this looks with Terraform and versioned AMIs:
# infrastructure/main.tf
variable "app_version" {
  description = "Application version to deploy"
  type        = string
  default     = "v1.3.0"
}

variable "subnet_ids" {
  description = "Subnets for the Auto Scaling group"
  type        = list(string)
}

data "aws_ami" "app" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["myapp-${var.app_version}"]
  }

  filter {
    name   = "tag:Immutable"
    values = ["true"]
  }
}

resource "aws_launch_template" "app" {
  name_prefix   = "myapp-"
  image_id      = data.aws_ami.app.id
  instance_type = "t3.medium"

  tag_specifications {
    resource_type = "instance"
    tags = {
      Version   = var.app_version
      Immutable = "true"
    }
  }
}

resource "aws_autoscaling_group" "app" {
  desired_capacity    = 3
  max_size            = 5
  min_size            = 2
  vpc_zone_identifier = var.subnet_ids # required: where instances launch

  launch_template {
    id      = aws_launch_template.app.id
    version = "$Latest"
  }
}
Deploying a new version is just terraform apply -var="app_version=v1.4.0". Terraform handles the orchestration.
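That one-variable workflow can be wrapped in a small deploy script. This is a hedged sketch: the Packer template name (myapp.json) is an assumption, and each step is echoed rather than executed so the sketch runs without Packer or Terraform installed—remove the echo wrapping to run it for real.

```shell
#!/bin/sh
# Minimal deploy wrapper: bake an image for the given version, then
# point Terraform at it. Steps are echoed instead of executed.
set -eu

deploy() {
  version="$1"
  echo "+ packer build -var app_version=$version myapp.json"
  echo "+ terraform apply -auto-approve -var app_version=$version"
}

deploy "v1.4.0"
```

A single version string drives both tools, which keeps the image and the infrastructure definition in lockstep.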
Building Immutable Images
The foundation of immutable infrastructure is the image. You have two main approaches: baking (pre-installing everything) and bootstrapping (minimal image + runtime configuration). Baking is slower to build but faster to deploy. For immutable infrastructure, baking wins—it’s more deterministic.
Here’s a Packer template that bakes an application into an AMI:
{
  "variables": {
    "app_version": "v1.3.0"
  },
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami_filter": {
      "filters": {
        "name": "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
      },
      "most_recent": true,
      "owners": ["099720109477"]
    },
    "instance_type": "t3.medium",
    "ssh_username": "ubuntu",
    "ami_name": "myapp-{{user `app_version`}}",
    "tags": {
      "Version": "{{user `app_version`}}",
      "Immutable": "true"
    }
  }],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "sudo apt-get update",
        "sudo apt-get install -y docker.io",
        "sudo systemctl enable docker"
      ]
    },
    {
      "type": "file",
      "source": "app/",
      "destination": "/tmp/app"
    },
    {
      "type": "shell",
      "inline": [
        "cd /tmp/app",
        "sudo docker build -t myapp:{{user `app_version`}} .",
        "sudo docker save myapp:{{user `app_version`}} -o /opt/myapp.tar"
      ]
    }
  ]
}
For containerized workloads, multi-stage Docker builds create lean, immutable images:
# Build stage
FROM node:18-alpine AS builder
WORKDIR /build
COPY package*.json ./
# Full install: devDependencies are needed for the build step
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only runtime deps reach the production stage
RUN npm prune --omit=dev
# Production stage
FROM node:18-alpine
WORKDIR /app
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001
COPY --from=builder --chown=nodejs:nodejs /build/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /build/node_modules ./node_modules
USER nodejs
EXPOSE 3000
CMD ["node", "dist/index.js"]
Tag images with semantic versions and Git SHAs: myapp:v1.3.0-abc123def. This creates an audit trail from production back to source code.
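That tagging scheme is a one-liner in shell. A quick sketch—the short SHA is hard-coded here so it runs outside a Git checkout; normally it would come from `git rev-parse --short HEAD`:

```shell
#!/bin/sh
# Compose an image tag of the form myapp:<semver>-<git sha>.
VERSION="v1.3.0"
GIT_SHA="abc123def"   # stand-in for: $(git rev-parse --short HEAD)
TAG="myapp:${VERSION}-${GIT_SHA}"
echo "$TAG"
```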
Deployment Strategies
Immutable infrastructure enables sophisticated deployment patterns. Blue-green deployments are the simplest: run two identical environments, switch traffic from blue to green, keep blue warm for instant rollback.
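In Kubernetes terms, the blue-green switch can be as small as a Service selector. A sketch with illustrative names: pointing `version` at green is the cutover, and pointing it back at blue is the instant rollback.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue   # flip to "green" to cut over; flip back to roll back
  ports:
    - port: 80
      targetPort: 3000
```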
Kubernetes makes rolling updates natural:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2        # never more than 8 pods total
      maxUnavailable: 1  # never fewer than 5 pods running
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: v1.4.0
    spec:
      containers:
        - name: myapp
          image: myregistry.io/myapp:v1.4.0
          ports:
            - containerPort: 3000
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 3
When you update the image tag, Kubernetes gradually replaces pods. The readiness probe ensures new pods are healthy before receiving traffic. If something fails, kubectl rollout undo deployment/myapp reverts instantly.
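The bounds in the manifest's comments follow directly from the strategy fields; as quick arithmetic:

```shell
#!/bin/sh
# Rolling-update bounds for replicas=6, maxSurge=2, maxUnavailable=1.
REPLICAS=6
MAX_SURGE=2
MAX_UNAVAILABLE=1
MAX_PODS=$((REPLICAS + MAX_SURGE))        # upper bound on total pods
MIN_READY=$((REPLICAS - MAX_UNAVAILABLE)) # lower bound on ready pods
echo "pods stay between $MIN_READY ready and $MAX_PODS total during a rollout"
```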
Handling State and Configuration
The hardest part of immutable infrastructure is state. Databases, uploaded files, and user sessions don’t fit the “replace everything” model.
The solution is separation. Stateful components live outside your immutable infrastructure:
# docker-compose.yml
version: '3.8'
services:
  app:
    image: myapp:${APP_VERSION}
    environment:
      - DATABASE_URL=postgresql://db:5432/myapp
      - REDIS_URL=redis://cache:6379
    depends_on:
      - db
      - cache

  db:
    image: postgres:15
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}  # required by the postgres image

  cache:
    image: redis:7-alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:
Configuration comes from environment variables or external systems. In Kubernetes:
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
data:
  LOG_LEVEL: "info"
  FEATURE_FLAGS: "new_ui,beta_api"
---
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
type: Opaque
data:
  DATABASE_PASSWORD: cGFzc3dvcmQxMjM=  # base64 encoded
  API_KEY: YWJjZGVmZ2hpams=
---
# Deployment fragment: only the env wiring is shown
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: myapp
          envFrom:
            - configMapRef:
                name: myapp-config
            - secretRef:
                name: myapp-secrets
Database migrations run as init containers or separate jobs, never baked into images.
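A sketch of the separate-job variant, reusing the versioned app image and the secrets defined above; the dist/migrate.js entrypoint is hypothetical, not from the examples above:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-migrate-v1-4-0
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: myregistry.io/myapp:v1.4.0
          command: ["node", "dist/migrate.js"]  # hypothetical migration entrypoint
          envFrom:
            - secretRef:
                name: myapp-secrets
```

Running migrations from the same versioned image keeps schema changes tied to the release that needs them, without baking migration state into the server image.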
Observability and Debugging
Without SSH access, observability becomes critical. You need centralized logging, metrics, and tracing.
Structured logging is non-negotiable:
// app/logger.js
const winston = require('winston');

const logger = winston.createLogger({
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  defaultMeta: {
    service: 'myapp',
    version: process.env.APP_VERSION,
    instance: process.env.HOSTNAME
  },
  transports: [
    new winston.transports.Console()
  ]
});

module.exports = logger;

// Usage
logger.info('Request processed', {
  requestId: req.id,
  userId: req.user.id,
  duration: elapsed,
  statusCode: res.statusCode
});
Logs flow to centralized systems like Loki or Elasticsearch. You debug by querying logs, not by SSH-ing into servers.
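With Loki, for instance, the JSON fields the logger emits become filterable. A hedged LogQL sketch—the app label name depends on your scrape configuration:

```logql
{app="myapp"} | json | version="v1.4.0" | statusCode >= 500
```

The json stage parses the structured log line, so any field the logger attached—version, instance, statusCode—becomes a query filter, which is the immutable-world replacement for grepping logs over SSH.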
Real-World Tradeoffs and When to Use
Immutable infrastructure isn’t free. Building images takes time. Storage costs multiply when you’re versioning AMIs. Small teams might find the overhead excessive.
Use immutable infrastructure when:
- You’re running distributed systems at scale
- Compliance requires audit trails
- You need reliable rollbacks
- Configuration drift has bitten you
Skip it when:
- You’re a three-person startup
- Your infrastructure is a single VPS
- Deployment frequency is monthly
The sweet spot is mid-sized teams running containerized applications on Kubernetes or serverless platforms where immutability is built-in.
Immutable infrastructure forces discipline. You can’t cowboy your way out of problems with quick SSH fixes. But that discipline creates systems that are predictable, reproducible, and resilient. The cattle-not-pets philosophy scales in ways that artisanal server maintenance never could.