Docker has revolutionized software packaging and deployment by introducing lightweight containers built from stacked filesystem layers. However, inefficient Dockerfile design can lead to bloated images, longer build times, and needless cache invalidation.

In this post, we’ll explore how Docker’s layered filesystem works, and share practical tips to write efficient Dockerfiles that result in smaller, faster, and reusable container images — perfect for production-grade pipelines.


What is Docker’s Layered Filesystem?

Every Docker image is composed of a stack of read-only layers, with a writable container layer added on top when a container is started.

Each instruction in a Dockerfile (like RUN, COPY, ADD) creates a new image layer. These layers are:

  • Immutable: Once created, a layer does not change
  • Cached: Reused across builds if the layer instructions haven’t changed
  • Shared: Between images and containers to save disk space

Example layer stack:

my-app:latest
├── Layer 1: FROM ubuntu:20.04  (the base image's layers)
├── Layer 2: RUN apt-get update
├── Layer 3: RUN apt-get install -y python3 python3-pip
├── Layer 4: COPY app/ /usr/src/app
└── Layer 5: RUN pip install -r requirements.txt
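The read-only/writable split is easy to see in practice: docker diff lists what a running container has added (A), changed (C), or deleted (D) in its writable layer relative to the image. A quick sketch (the container name web is just an example, and nginx will also record a few runtime files of its own):

docker run -d --name web nginx
docker exec web touch /tmp/hello
docker diff web        # shows entries like "A /tmp/hello" from the writable layer
docker rm -f web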

How Docker Builds Images

When you build a Dockerfile:

  1. Docker evaluates each instruction sequentially
  2. For each instruction, it looks for a cached layer, keyed on:
    • The instruction string itself
    • A checksum of the files involved (for COPY and ADD)
  3. If nothing has changed, it reuses the cached layer
  4. If something has changed, it invalidates the cache for that instruction and every layer that follows

Tip: Order your instructions from least to most frequently changed.
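For example, here is a sketch of that ordering for a small Python service (the file names and packages are placeholders): the system packages and dependency install sit near the top where they stay cached, and the frequently edited source is copied last.

FROM python:3.11-slim

# Changes rarely: system packages
RUN apt-get update && apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Changes occasionally: Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Changes constantly: application source
COPY . .
CMD ["python", "app.py"]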


Optimizing Your Dockerfile: Best Practices

1. Minimize Layers

Combine related commands with && and line continuations (\) so they run in a single RUN instruction and produce one layer. This also lets you clean up package caches in the same layer, so the deleted files never reach the final image:

RUN apt-get update && \
    apt-get install -y --no-install-recommends curl git python3 && \
    rm -rf /var/lib/apt/lists/*

2. Leverage Layer Caching

Move frequently-changing instructions (like COPY . .) to the bottom:

COPY requirements.txt .  
RUN pip install -r requirements.txt  
COPY . .  # <- This should come after installing dependencies
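The same ordering works in other ecosystems. For a Node.js service, a sketch might look like this (assumes npm 8+ for the --omit=dev flag):

COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
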
3. Use a .dockerignore File

Exclude files that never need to be copied into the image (e.g., .git, node_modules, *.log) to shrink the build context and avoid needless cache invalidation. A typical .dockerignore:

.git  
*.log  
__pycache__/
.env  
node_modules/

4. Choose Lightweight Base Images

Use minimal images like Alpine or Distroless:

FROM python:3.11-alpine
# or
FROM gcr.io/distroless/python3

Compared with the full Debian-based defaults, this can cut image size by 70–90%.


Understanding Cache Invalidation

The build cache breaks when:

  • File contents change
  • Order of Dockerfile instructions changes
  • ENV variables or ARGs change
  • Dependencies (like pip packages) change

Tip: Copy only the dependency manifest before installing, so the install layer’s cache key depends on that file’s checksum rather than on your whole source tree:

COPY requirements.txt .  
RUN pip install --no-cache-dir -r requirements.txt
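If you build with BuildKit (the default builder in current Docker releases), a cache mount can go one step further and keep pip’s download cache across builds even when requirements.txt changes. A sketch — the syntax directive must be the very first line of the Dockerfile:

# syntax=docker/dockerfile:1
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

Note that --no-cache-dir is dropped here on purpose: the point of the mount is to reuse pip’s cache between builds while still keeping it out of the image layers.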

Multi-Stage Builds for Clean Final Images

Multi-stage builds let you compile code in one image and copy only the resulting artifacts into a slim final image:

# Stage 1: Builder
FROM golang:1.20 AS builder
WORKDIR /app
COPY . .
# CGO_ENABLED=0 gives a statically linked binary that runs on musl-based Alpine
RUN CGO_ENABLED=0 go build -o app .

# Stage 2: Final
FROM alpine
COPY --from=builder /app/app /usr/bin/app
CMD ["app"]

This keeps compilers and other build-time dependencies out of the final image and sharply reduces its size.
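The same pattern carries over to Python. A sketch (app.py and requirements.txt are assumed names): build wheels in a throwaway stage, then install only those wheels in the runtime stage so build tooling never reaches the final image.

# Stage 1: build wheels for all dependencies
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# Stage 2: runtime image without the build toolchain
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels
COPY . .
CMD ["python", "app.py"]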


Visualizing Layers

Use the dive tool to inspect layers and identify waste:

brew install dive   # macOS via Homebrew; Linux packages are on the project's releases page
dive my-app:latest

This shows you which layers add the most size and which files can be removed or merged.
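For a quick, no-install alternative, docker history (built into the Docker CLI) lists every layer of an image along with the instruction that created it and its size:

docker history my-app:latest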


Summary of Tips

Tip                          Benefit
Combine RUN commands         Fewer layers, smaller image
Order COPY/ADD logically     Better caching
Use .dockerignore            Smaller build context
Base on Alpine/Distroless    Lightweight image
Multi-stage builds           Clean, production-ready images
Inspect with dive            Visualize layer inefficiencies

Conclusion

Understanding Docker’s layered filesystem is the key to efficient and maintainable container builds. By structuring your Dockerfiles strategically and leveraging multi-stage builds, cache-friendly layer ordering, and slim base images, you can deliver faster pipelines, smaller images, and more reliable deployments.

Adopt these practices today and optimize your containers for the future of cloud-native development.