Cache miss for the first stage of multi-stage build
Behaviour
Steps to reproduce this issue
- Fork the following repository: https://github.com/sagikazarmark/docker-cache-arg
- Observe CI building the project and saving cache
- Add an empty commit to HEAD:
git commit -m 'First empty' --allow-empty && git push
- Observe CI building the project, loading cache (from the previous commit) but NOT using it for the first stage and using it for the second
- Remove lines 73 and 74 (COMMIT_HASH and BUILD_DATE) from .github/workflows/ci.yml
- Observe CI building the project, loading cache (from the previous commit) but NOT using it for the first stage and using it for the second
- Add another empty commit to HEAD:
git commit -m 'Second empty' --allow-empty && git push
- Observe CI building the project, loading cache (from the previous commit) and PROPERLY using it for both stages
- Add an empty file to the repo:
touch test && git add test && git commit -m 'Add test file' && git push
- Observe CI building the project, loading cache (from the previous commit) but NOT using it for the first stage and using it for the second
Alternatively, examine the same behavior on this branch: https://github.com/sagikazarmark/docker-cache-arg/actions?query=branch%3Atest-branch3
Expected behaviour
The build should successfully utilize the cache when applicable.
Actual behaviour
The first stage doesn’t use cache at all in certain cases (see below).
Configuration
See the above linked repository
Logs
See the above linked repository
Details
I’ve spent 6 hours debugging this issue and I can’t crack it. There is a good chance that either I’m screwing up something or this is not an issue with the action itself, but (obviously) I can’t reproduce it locally (Docker for Mac + buildx + docker-container builder), so here it goes:
Although the title suggests that the cache is not being used for the first stage, there are actually multiple factors at play here:
First of all, when I talk about the first stage receiving a cache miss, I mean the entire stage. Not just steps susceptible to build args or file copying, everything is rebuilt. For reference, here is the Dockerfile:
ARG GO_VERSION=1.15
ARG FROM_IMAGE=scratch
FROM golang:${GO_VERSION}-alpine3.12 AS builder
# set up nsswitch.conf for Go's "netgo" implementation
# https://github.com/gliderlabs/docker-alpine/issues/367#issuecomment-424546457
RUN echo 'hosts: files dns' > /etc/nsswitch.conf.build
RUN apk add --update --no-cache bash ca-certificates make curl git mercurial tzdata
ENV GOFLAGS="-mod=readonly"
ARG GOPROXY
RUN mkdir -p /build
WORKDIR /build
COPY go.* /build/
RUN go mod download
ARG VERSION
ARG COMMIT_HASH
ARG BUILD_DATE
COPY . /build
RUN go build -o /build/hello
FROM ${FROM_IMAGE}
COPY --from=builder /build/hello /
CMD ["/hello"]
Now, if any file that is part of the Docker context changes, the first stage receives a cache miss (proven by items 9 and 10 on the above list). As you can see, this should only affect the COPY . /build step, nothing before that.
Furthermore, if any of the build arguments change, the entire stage is rebuilt (COMMIT_HASH and BUILD_DATE change with every commit; removing them resolves the issue). Again, there are several steps before those build arguments which could easily come from cache.
Looking at the caching steps: they properly save and restore caches, and Docker is able to load the cache (item 8 on the above list proves that).
It’s also interesting to see that further stages are properly loaded from the cache. In a different project (where I first noticed this issue) several other stages were also properly loaded from the cache, even when COPYing something from the first stage that changed (the steps before that COPY were loaded from the cache, the rest were rebuilt as expected).
When trying to reproduce locally, I used the exact same commands for building the image (except for iidfile and the secret) and caching worked as expected.
To summarize:
- The first stage is completely rebuilt in certain cases (build arg change, file change)
- The rest of the stages are loaded from cache as expected
- Only the first stage is affected
- Cannot reproduce this issue locally
Again, I realize this might not be a problem with this action, but the fact that I cannot reproduce this locally suggests that this might be environmental after all.
Thanks!
Issue Analytics
- Created 3 years ago
- Comments: 9 (4 by maintainers)
Top GitHub Comments
Actually, I think it would make sense to mention the solution somewhere. When using multi-stage builds (which I think is fairly common these days), mode=max has to be added to the cache options, otherwise only the last stage is cached (as pointed out in the above issue). Do you think it would make sense to mention this somewhere, @crazy-max?
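For readers landing here, a minimal sketch of what adding mode=max to the cache options might look like in a workflow step. This is not the actual ci.yml from the linked repository: the step name, the action version, and the /tmp/.buildx-cache path are assumptions made for illustration.

```yaml
# Sketch only: the step name, action version, and cache path are hypothetical.
- name: Build
  uses: docker/build-push-action@v2
  with:
    context: .
    # Import the cache restored from a previous run
    cache-from: type=local,src=/tmp/.buildx-cache
    # mode=max exports layers from all stages; the default (mode=min)
    # exports only the layers of the final image
    cache-to: type=local,dest=/tmp/.buildx-cache,mode=max
```

With the default export mode, only the layers of the final stage end up in the exported cache, which would match the symptom described above: the builder stage is rebuilt while later stages are cached.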
@sagikazarmark Yes, I will add a note in this section.
Do you have a local repro? Are you aware of this kind of issue with local cache @tonistiigi?