Docker Multi-Stage Builds: Optimizing Image Size
Key Insights
- Multi-stage builds can reduce Docker image sizes by 80-95% by separating build-time dependencies from runtime requirements, directly impacting deployment speed and storage costs.
- Each FROM statement in a Dockerfile creates a new stage, allowing you to copy only necessary artifacts forward while discarding compilers, build tools, and source code.
- Choosing minimal base images like Alpine Linux or Google’s Distroless containers for your final stage, combined with proper layer ordering, maximizes both size reduction and build cache efficiency.
The Image Size Problem
Docker image size isn’t just a vanity metric. Every megabyte in your image translates to real costs: slower CI/CD pipelines, increased registry storage fees, longer deployment times, and a larger attack surface for security vulnerabilities. A typical Node.js application built in a single stage can easily balloon to 1GB+, while the actual application code might only be a few megabytes.
The traditional approach of building everything in one Dockerfile forces you to include compilers, build tools, development dependencies, and source files in your final image—even though none of these are needed at runtime. Multi-stage builds solve this elegantly by separating the build environment from the runtime environment.
Understanding Multi-Stage Builds
Multi-stage builds use multiple FROM statements in a single Dockerfile. Each FROM instruction begins a new stage, and you can selectively copy artifacts from previous stages using COPY --from. The crucial insight: only the final stage becomes your image. Everything else is discarded.
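Worth knowing: COPY --from also accepts a plain image reference, not just a named stage, so you can lift a single file out of a published image (the nginx image and version here are just an illustration):

```dockerfile
# Copy a file directly from a registry image instead of a build stage
COPY --from=nginx:1.25 /etc/nginx/nginx.conf /etc/nginx/nginx.conf
```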
Here’s a simple comparison:
Traditional single-stage approach:
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
CMD ["node", "dist/index.js"]
Multi-stage approach:
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package*.json ./
RUN npm install --production
CMD ["node", "dist/index.js"]
The multi-stage version separates build dependencies from runtime dependencies. The final image contains only the compiled code and production dependencies, not the TypeScript compiler, dev tools, or source files.
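As a variant worth considering: on npm 8 and later, npm ci --omit=dev is the recommended replacement for npm install --production, and installs exactly what the lockfile specifies. A sketch of the runtime stage using those flags (assumes a package-lock.json is present):

```dockerfile
# Runtime stage variant: reproducible installs from package-lock.json
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package*.json ./
# --omit=dev skips devDependencies (modern equivalent of --production)
RUN npm ci --omit=dev
CMD ["node", "dist/index.js"]
```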
Real-World Example: Building a Go Application
Go applications demonstrate multi-stage builds particularly well because Go compiles to a single binary with no runtime dependencies. Let’s build a simple HTTP server.
Single-stage Dockerfile (bloated):
FROM golang:1.21
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o server .
CMD ["./server"]
This creates an image around 1GB because it includes the entire Go toolchain, module cache, and source code.
Multi-stage Dockerfile (optimized):
# Build stage
FROM golang:1.21 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o server .
# Runtime stage
FROM alpine:3.19
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/server .
CMD ["./server"]
The difference is dramatic:
$ docker images
REPOSITORY TAG SIZE
go-app-single latest 1.05GB
go-app-multi latest 12.4MB
That’s a 98.8% reduction. The final image contains only the compiled binary and minimal Alpine Linux base—no compiler, no source code, no build cache.
Advanced Patterns and Techniques
Real applications often require multiple build tools. Here’s a more complex example with a Node.js frontend and a Python backend:
# Node.js build stage
FROM node:18 AS node-builder
WORKDIR /app/frontend
COPY frontend/package*.json ./
RUN npm ci
COPY frontend/ ./
RUN npm run build
# Python build stage
FROM python:3.11-slim AS python-builder
WORKDIR /app
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY backend/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Final runtime stage
FROM python:3.11-slim
WORKDIR /app
# Copy Python virtual environment
COPY --from=python-builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# Copy built frontend assets
COPY --from=node-builder /app/frontend/dist ./static
# Copy backend application
COPY backend/ ./
EXPOSE 8000
CMD ["python", "main.py"]
Notice how we name stages with AS and reference them explicitly in COPY --from. You can also build specific stages as targets:
# Build only the node-builder stage for testing
docker build --target node-builder -t frontend-test .
This is invaluable for debugging and testing intermediate build steps.
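The same mechanism supports a dedicated test stage that never ships in the final image. A sketch building on the example above (the test stage name and the npm test script are assumptions about your project):

```dockerfile
# Test stage: inherits the node-builder stage's dependencies and source
FROM node-builder AS test
WORKDIR /app/frontend
RUN npm test
```

Running docker build --target test . executes the tests at build time, while a default build skips this stage entirely because no later stage depends on it.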
Best Practices for Maximum Optimization
Choose the smallest viable base image. For the final stage, prefer:
- alpine variants (5-10MB base)
- Google’s Distroless images (20-50MB, no shell or package manager)
- scratch for static binaries (0MB base)
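Taking this to its extreme, a fully static Go binary can run on scratch. A minimal sketch, assuming the binary is built with CGO_ENABLED=0 and the application needs CA certificates for outbound TLS:

```dockerfile
FROM golang:1.21 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o /server .

# scratch is an empty image: no shell, no libc, no package manager
FROM scratch
# CA bundle copied from the Debian-based builder for TLS verification
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /server /server
ENTRYPOINT ["/server"]
```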
Order layers for cache efficiency:
FROM golang:1.21 AS builder
WORKDIR /app
# Copy dependency files first (changes infrequently)
COPY go.mod go.sum ./
RUN go mod download
# Copy source code last (changes frequently)
COPY . .
RUN go build -o server .
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]
Dependency files change less often than source code. By copying them first, Docker can cache the go mod download layer across builds.
Minimize layer count in final stage:
# Good: Combine commands to reduce layers
FROM alpine:3.19
RUN apk --no-cache add ca-certificates tzdata && \
addgroup -g 1000 app && \
adduser -D -u 1000 -G app app
USER app
Security consideration: Distroless images contain only your application and runtime dependencies—no shell, no package manager. This dramatically reduces the attack surface:
FROM golang:1.21 AS builder
WORKDIR /app
COPY . .
RUN go build -o app .
FROM gcr.io/distroless/base-debian12
COPY --from=builder /app/app /
USER nonroot:nonroot
CMD ["/app"]
Measuring and Comparing Results
Use docker history to see layer sizes:
$ docker history go-app-multi
IMAGE CREATED BY SIZE
<missing> CMD ["./server"] 0B
<missing> COPY /app/server . # buildkit 8.2MB
<missing> WORKDIR /root/ 0B
<missing> RUN /bin/sh -c apk --no-cache add ca-certif… 234kB
For detailed analysis, use docker image inspect:
$ docker image inspect go-app-multi --format='{{.Size}}' | numfmt --to=iec
12M
Here’s a typical before/after comparison:
| Metric | Single-Stage | Multi-Stage | Reduction |
|---|---|---|---|
| Image Size | 1.05GB | 12.4MB | 98.8% |
| Layers | 12 | 4 | 66.7% |
| Build Time | 45s | 52s | -15.6% |
| Push Time | 38s | 2s | 94.7% |
Build time may increase slightly due to multiple stages, but push/pull times improve dramatically—often the more important metric for CI/CD pipelines.
Common Pitfalls and Troubleshooting
Pitfall 1: Missing runtime dependencies
Your application builds fine but crashes at runtime because you forgot a system library:
# Wrong: Missing SSL certificates
FROM scratch
COPY --from=builder /app/server .
CMD ["./server"]
# Right: Include necessary runtime dependencies
FROM alpine:3.19
RUN apk --no-cache add ca-certificates
COPY --from=builder /app/server .
CMD ["./server"]
Pitfall 2: Copying unnecessary files
# Wrong: Copies the builder's entire working directory,
# including source files and build caches
COPY --from=builder /app .
# Right: Copy only what's needed
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json .
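A .dockerignore file complements this by keeping unneeded files out of the build context in the first place, so a broad COPY . . in the builder stage picks up less junk. A typical sketch (entries are illustrative):

```
node_modules
.git
*.log
Dockerfile
```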
Pitfall 3: Incorrect base image for compiled binaries
If you compile with CGO enabled but use scratch, the container fails at startup—typically with a cryptic “no such file or directory” error, because the dynamic linker the binary expects isn’t present. Debug by inspecting intermediate stages:
# Build and run the builder stage directly
docker build --target builder -t debug-builder .
docker run -it debug-builder /bin/sh
# Inside container, check binary dependencies
ldd /app/server
If ldd shows dependencies, you need a base image with those libraries (like Alpine), or recompile with CGO_ENABLED=0.
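If the binary genuinely needs CGO (for example, it links against a C library), another option is to build and run on the same musl-based distribution. A sketch using Alpine for both stages:

```dockerfile
# Builder and runtime share the same musl libc, so dynamic linking works
FROM golang:1.21-alpine AS builder
# CGO builds on Alpine may also need: RUN apk add --no-cache build-base
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o server .

FROM alpine:3.19
RUN apk --no-cache add ca-certificates
COPY --from=builder /app/server .
CMD ["./server"]
```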
Multi-stage builds are the single most effective technique for reducing Docker image sizes. Start with a simple two-stage build separating build and runtime, then optimize further with minimal base images and careful layer ordering. Your deployment pipeline will thank you.