Layer Caching in Docker Is a Big Deal and Most Devs Ignore It
by Arif Ikhsanudin, Backend Developer
Ten developers, eighty wasted minutes a day
Multiply ten developers by four Docker builds each per day at two minutes a build and you get 80 developer-minutes daily on image builds. Multiply that by 250 working days and you're at 333 developer-hours annually — roughly eight developer-weeks — sitting at a terminal watching progress bars. This is a common situation on teams that have never deliberately optimized their Dockerfile layer order, and the fix takes an afternoon.
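The estimate above is easy to rerun with your own team's numbers. Here is a minimal sketch of the arithmetic; the function name is made up for illustration, and the 40-hour week used to convert hours into weeks is an assumption.

```python
def annual_build_hours(developers: int, builds_per_day: int,
                       minutes_per_build: float, working_days: int = 250) -> float:
    """Developer-hours per year spent waiting on image builds."""
    daily_minutes = developers * builds_per_day * minutes_per_build
    return daily_minutes * working_days / 60


# The article's figures: 10 developers, 4 builds a day, 2 minutes a build.
hours = annual_build_hours(developers=10, builds_per_day=4, minutes_per_build=2)
print(f"{hours:.0f} developer-hours per year")

# Assumes a 40-hour working week.
print(f"{hours / 40:.1f} developer-weeks per year")
```

Swap in your own build count and build duration; the result scales linearly with each input.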
The mechanism at the center of this is Docker's layer cache. Understanding it properly changes how you write Dockerfiles.
What a layer is and when Docker caches it
Every instruction in a Dockerfile that modifies the filesystem creates a new layer: RUN, COPY, ADD. Each layer is identified by a hash of its instruction text and its inputs. Docker compares this hash against its cache before executing. On a hit, Docker reuses the cached layer — no execution, no I/O, near-instant. On a miss, Docker executes the instruction and creates a new layer, then invalidates the cache for all subsequent layers, because downstream layers may depend on what this one produced.
This cascading invalidation is the behavior most developers don't internalize. A single cache miss early in the Dockerfile forces every subsequent instruction to re-execute, even if nothing those instructions depend on has changed.
FROM python:3.12-slim
WORKDIR /app
# Cache miss here on any source change
COPY . .
# Always re-runs, even if requirements are unchanged
RUN pip install -r requirements.txt
# Always re-runs
RUN python manage.py collectstatic
Every time any source file changes — a comment in a view, a blank line in a model — pip reinstalls all dependencies from scratch. On a project with 150 Python packages, that's 90 seconds on every build.
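To make the cascade concrete, here is a toy model of the lookup, not Docker's actual implementation. It assumes each layer's key is a hash of its parent's key, its instruction text, and a digest of its inputs, so a change in one layer changes every key after it. The names layer_key, plan_build, and the digest strings are all hypothetical.

```python
import hashlib


def layer_key(parent_key: str, instruction: str, inputs_digest: str) -> str:
    """Toy cache key: chaining the parent key makes upstream changes propagate."""
    material = f"{parent_key}|{instruction}|{inputs_digest}"
    return hashlib.sha256(material.encode()).hexdigest()


def plan_build(steps, cache):
    """Return (instruction, status) pairs, where status is 'CACHED' or 'RUN'."""
    results = []
    parent = "base-image"
    for instruction, inputs_digest in steps:
        key = layer_key(parent, instruction, inputs_digest)
        results.append((instruction, "CACHED" if key in cache else "RUN"))
        cache.add(key)  # the freshly built layer is cached for next time
        parent = key
    return results


def dockerfile(source_digest: str):
    """The Dockerfile above; only COPY has file inputs in this model."""
    return [
        ("WORKDIR /app", ""),
        ("COPY . .", source_digest),
        ("RUN pip install -r requirements.txt", ""),
        ("RUN python manage.py collectstatic", ""),
    ]


cache = set()
plan_build(dockerfile("source-v1"), cache)            # first build fills the cache
second = plan_build(dockerfile("source-v2"), cache)   # one source file edited
for instruction, status in second:
    print(f"{status:6} {instruction}")
```

In the second build only WORKDIR is served from cache. The pip install instruction text did not change, but its parent did, so its key no longer matches anything cached.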
Ordering layers by change frequency
The fix is simple to state: put instructions that change rarely near the top, instructions that change often near the bottom.
FROM python:3.12-slim
WORKDIR /app
# Only changes when dependencies change
COPY requirements.txt .
# Cached until requirements.txt changes
RUN pip install --no-cache-dir -r requirements.txt
# Changes with every source edit
COPY . .
RUN python manage.py collectstatic
Now pip install is cached as long as requirements.txt doesn't change. A source edit only replays the last two instructions. The savings are proportional to how expensive the cached layer is — package installation, compilation, asset processing.
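A rough way to see the saving is to add up the steps that replay after a source edit under each ordering. This is a sketch under stated assumptions: the 90-second install comes from the example above, while the copy and collectstatic timings are hypothetical placeholders, and rebuild_seconds is an illustrative helper.

```python
def rebuild_seconds(steps, first_invalidated):
    """Sum the cost of the first invalidated step and every step after it."""
    names = [name for name, _ in steps]
    start = names.index(first_invalidated)
    return sum(seconds for _, seconds in steps[start:])


# (instruction, seconds). Only the 90 s install figure is from the article;
# the other durations are made-up placeholders.
naive = [
    ("COPY . .", 1),
    ("RUN pip install -r requirements.txt", 90),
    ("RUN python manage.py collectstatic", 5),
]
ordered = [
    ("COPY requirements.txt .", 1),
    ("RUN pip install --no-cache-dir -r requirements.txt", 90),
    ("COPY . .", 1),
    ("RUN python manage.py collectstatic", 5),
]

# A source edit first invalidates the step that copies the source tree.
naive_cost = rebuild_seconds(naive, "COPY . .")
ordered_cost = rebuild_seconds(ordered, "COPY . .")
print(f"naive: {naive_cost}s, ordered: {ordered_cost}s")
```

With these placeholder timings the naive ordering replays 96 seconds of work per source edit and the reordered one replays 6. The exact numbers will differ per project, but the expensive install step drops out of the sum either way.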
A taxonomy of layers by cache stability
Not all layers are equal. In practice, layers fall into rough stability tiers:
Highly stable (cache for weeks or months):
- Base image pulls: FROM node:20-alpine
- Global tool installation: system packages that your app runtime depends on
- Build tool installation: compilers, build utilities
Moderately stable (cache for days, invalidated by dep changes):
- Dependency manifest copy: COPY package.json package-lock.json ./
- Dependency installation: npm ci, pip install, mvn dependency:go-offline
Volatile (invalidated on nearly every build):
- Application source: COPY src/ ./src/
- Generated artifacts: compiled output, static assets
The ideal Dockerfile mirrors this ordering: stable at top, volatile at bottom.
The gotcha with multi-module projects
In a Maven multi-module project, copying only pom.xml from the root isn't enough — child module POMs also need to be present before mvn dependency:go-offline can resolve the full dependency graph.
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app
# Copy all POMs first to allow dependency resolution
COPY pom.xml .
COPY module-api/pom.xml module-api/
COPY module-core/pom.xml module-core/
COPY module-web/pom.xml module-web/
RUN mvn dependency:go-offline -q
# Now copy source
COPY module-api/src module-api/src
COPY module-core/src module-core/src
COPY module-web/src module-web/src
RUN mvn package -DskipTests -q
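Verbose, but each module's source is its own layer. If only module-web changes, only its source layer and the final package step need to replay, because module-web is copied last. A change in module-api would also replay the COPY layers after it, so copy the modules you edit most often last. Either way, the dependency layer stays cached.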
BuildKit's cache mounts: persistent caches across builds
BuildKit adds a more powerful tool: --mount=type=cache. This mounts a persistent cache directory that survives between builds on the same machine, separate from the layer cache.
FROM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /app
COPY pom.xml .
RUN --mount=type=cache,target=/root/.m2 \
    mvn dependency:go-offline -q
COPY src ./src
RUN --mount=type=cache,target=/root/.m2 \
    mvn package -DskipTests -q
The Maven local repository at /root/.m2 is preserved between builds. Even when the layer cache misses (because pom.xml changed), Maven finds its previously downloaded JARs in the mounted cache and only downloads what's new. On a large project this can reduce dependency resolution from around 3 minutes to under 30 seconds, even on a full layer-cache miss.
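The tradeoff: this cache is machine-local, so ephemeral CI runners that start from a clean disk won't benefit. For CI, use registry-based caching instead: BuildKit can export its layer cache to a registry and import it on the next run through the --cache-to and --cache-from options of docker buildx build with type=registry.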
Cache keys and COPY precision
Docker computes the cache key for a COPY instruction from a checksum of all copied files. COPY . . checksums your entire build context, meaning everything in the directory that .dockerignore doesn't exclude. If any of those files changes, even a README.md, the cache is invalidated.
Be precise with COPY:
# Broad — invalidated by README, .gitignore, test files, anything
COPY . .
# Precise — only invalidated when business logic changes
COPY src/main/ ./src/main/
This is particularly impactful when you have test directories, documentation, or tooling config that has no bearing on the runtime image. Copy only the source that the RUN instruction actually needs.
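The difference between a broad and a precise COPY can be shown with a toy checksum. This sketch is not Docker's real algorithm: it assumes the key is a hash over the relative paths and contents of the copied files. copy_digest and the file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def copy_digest(context: Path, source: str) -> str:
    """Toy COPY key: hash the relative path and contents of every file under source."""
    digest = hashlib.sha256()
    root = context / source
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(str(path.relative_to(context)).encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    context = Path(tmp)
    (context / "src" / "main").mkdir(parents=True)
    (context / "src" / "main" / "app.py").write_text("print('hello')\n")
    (context / "README.md").write_text("version 1\n")

    broad_before = copy_digest(context, ".")           # models COPY . .
    precise_before = copy_digest(context, "src/main")  # models COPY src/main/ ./src/main/

    # Edit a file the runtime image never needs.
    (context / "README.md").write_text("version 2\n")

    broad_changed = copy_digest(context, ".") != broad_before
    precise_changed = copy_digest(context, "src/main") != precise_before

print(f"broad key changed: {broad_changed}, precise key changed: {precise_changed}")
```

The README edit changes the broad key and leaves the precise one alone, which is why a documentation change can trigger a full rebuild behind COPY . . but not behind a narrow COPY.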
How to verify your cache is working
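Run the build twice and observe the output. With the classic builder (on recent Docker versions, where BuildKit is the default, this means setting DOCKER_BUILDKIT=0):
DOCKER_BUILDKIT=0 docker build . 2>&1 | grep -E "Using cache|Step"
Steps followed by "Using cache" were served from cache. If you changed only a source file and the dependency installation step is not followed by "Using cache", your ordering is wrong.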
With BuildKit's --progress=plain:
docker buildx build --progress=plain . 2>&1 | grep -E "CACHED|[0-9]+\.[0-9]+s"
Each step shows either CACHED or its execution time. The slow steps that aren't marked CACHED are your optimization targets.
The one thing to act on today
Open your team's most-used Dockerfile. Look at where the source code copy (COPY . . or similar) falls relative to the dependency installation step. If source is copied before dependencies are installed, swap them. Add a .dockerignore if it's missing. That's the change — it takes 10 minutes and the benefits accumulate on every build from that point forward.
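If you have many Dockerfiles to check, the inspection described above can be roughed out as a script. This is a heuristic sketch, not a real linter: INSTALL_MARKERS is a hypothetical list you would extend for your stack, and the function ignores backslash line continuations and JSON-form instructions.

```python
# Hypothetical substrings that mark a dependency installation step.
INSTALL_MARKERS = ("pip install", "npm ci", "mvn dependency:go-offline")


def broad_copy_before_install(dockerfile_text: str) -> bool:
    """True if a `COPY .` or `ADD .` precedes a dependency install in the same stage."""
    seen_broad_copy = False
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        keyword = parts[0].upper()
        if keyword == "FROM":
            seen_broad_copy = False  # each build stage starts fresh
        elif keyword in ("COPY", "ADD"):
            # Everything between the flags and the final destination is a source.
            sources = [p for p in parts[1:-1] if not p.startswith("--")]
            if "." in sources or "./" in sources:
                seen_broad_copy = True
        elif keyword == "RUN" and seen_broad_copy:
            if any(marker in line for marker in INSTALL_MARKERS):
                return True
    return False


BAD = """\
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
"""

GOOD = """\
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
"""

print(broad_copy_before_install(BAD), broad_copy_before_install(GOOD))
```

A True result means the file is a candidate for the swap; it still needs a human look, since the marker list is only a guess at what counts as an install step.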