Primary cache should be re-evaluated when deciding whether to update the cache in post-job phase
I have the following step in my Maven build job to cache the CVE database created by the dependency-check plugin, as it can take a while to download from scratch:
- name: Cache CVE database for OWASP dependency-check
  uses: actions/cache@v3
  with:
    path: dependency-check-data
    key: "${{ runner.os }}-dependency-check-data-${{ hashFiles('**/nvdcve-1.1-modified.meta') }}"
    restore-keys: |
      ${{ runner.os }}-dependency-check-data-${{ hashFiles('**/nvdcve-1.1-modified.meta') }}
      ${{ runner.os }}-dependency-check-data-
The file I use to determine whether the cache should be updated is dependency-check-data/nvdcve-1.1-modified.meta. This file is only created during the build job, i.e. it is not part of the repository.
First run
Obviously no cache hit.
In the post-build step the cache is created correctly, but GitHub Actions says:
Cache saved with key: Linux-dependency-check-database-
That’s not correct: it should save the cache with key Linux-dependency-check-database-e6dd7c26af2b2d399adc976c9e67d9ae0e4013e1ef99f753a41fea199a11109e.
Full debug output of the post-build step: https://gist.github.com/fransf-wtax/4f1c5c2aa5a153807bcd145aee281b33
Second run
During cache restore, i.e. before the build, the key evaluates to Linux-dependency-check-data-, which is fine because the cache action should match on prefixes, so the most recent cache created by a previous build would be used.
Full debug output of the build step: https://gist.github.com/fransf-wtax/58e6f867603f076f874b2b4ceff7c25f
But after the build, when the cache is updated, GitHub Actions says:
##[debug]Cache state/key: Linux-dependency-check-data-
Cache hit occurred on the primary key Linux-dependency-check-data-, not saving cache.
This is not correct. The file matching **/nvdcve-1.1-modified.meta now exists, so it should re-evaluate the key "${{ runner.os }}-dependency-check-data-${{ hashFiles('**/nvdcve-1.1-modified.meta') }}", which would yield Linux-dependency-check-data-a65b573939924c996fc1da62edc8e7c7c4cdf4ea9ec5b7a3c2b495ff37333213. That is NOT the same key as the cache that was restored, and therefore not a cache hit.
Full output of the post-build step: https://gist.github.com/fransf-wtax/f69804e2f5ec2dd0c3b7577b618be3be
Conclusion
It looks like GitHub is either not evaluating the cache key expression in the post-build step at all, or evaluating it incorrectly. If the key were re-evaluated at that point, I believe the cache action would work as expected.
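One possible workaround, sketched under the assumption that the cache action version in use ships the separate actions/cache/restore and actions/cache/save sub-actions (to my knowledge these arrived in the v3.2 line): split the single step in two, so that the save key is an input of a step that only starts after the build. Step inputs are evaluated when the step begins, so hashFiles should then see the freshly created .meta file. The step names and the placeholder comment are illustrative only.

```yaml
# Restore whichever cache is most recent; the .meta file does not exist yet,
# so only the prefix is used here.
- name: Restore CVE database for OWASP dependency-check
  uses: actions/cache/restore@v3
  with:
    path: dependency-check-data
    key: ${{ runner.os }}-dependency-check-data-
    restore-keys: |
      ${{ runner.os }}-dependency-check-data-

# ... Maven build step that runs dependency-check and writes
#     dependency-check-data/nvdcve-1.1-modified.meta goes here ...

# The key below is evaluated when this step starts, i.e. after the build,
# so hashFiles() can match the .meta file.
- name: Save CVE database for OWASP dependency-check
  uses: actions/cache/save@v3
  with:
    path: dependency-check-data
    key: ${{ runner.os }}-dependency-check-data-${{ hashFiles('**/nvdcve-1.1-modified.meta') }}
```

If the .meta file is unchanged, the save step will most likely find that a cache with that key already exists and skip the upload with a warning, which is the desired outcome here.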
Top GitHub Comments
Adding a simple boolean as an input to the action to re-evaluate the key in the post step could be very helpful, e.g. defaulting to the current behaviour:
inputs:
  reevaluate:
    default: false
I’m interested from the R programming community’s point of view when it comes to using code chunk caching within the {rmarkdown} package & specifically {knitr} chunk options.
{knitr} creates & invalidates its cache for each code chunk while the document is being rendered. One could invalidate the cache by simply hashing the file that the knitr code chunks are within, but this does not check whether individual code chunks within the file have changed or whether variable inputs to the chunks have changed (i.e. the reasons to invalidate/bust the cache). A much simpler solution would be to hash all files within the knitr cache’s output directory (most people will not version this - it’s binary data).
A long way of saying: Time to learn some TypeScript 😉
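For what it's worth, the idea of hashing the knitr cache directory might look roughly like this in a workflow. This is only a sketch: report.Rmd and report_cache are hypothetical placeholders for the document and for wherever knitr writes its chunk cache, and it assumes the separate actions/cache/save sub-action is available, so that the key is evaluated after rendering rather than before.

```yaml
# Hypothetical names: report.Rmd and report_cache are placeholders.
- name: Render document
  run: Rscript -e 'rmarkdown::render("report.Rmd")'

- name: Save knitr cache
  uses: actions/cache/save@v3
  with:
    path: report_cache
    # Evaluated when this step starts, i.e. after rendering has
    # created or updated the files in the cache directory.
    key: ${{ runner.os }}-knitr-${{ hashFiles('report_cache/**') }}
```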
That’s disappointing but consistent with the general feel I’m getting around GitHub Actions – that it’s an unfinished product that is really an internal tool for Microsoft or GitHub that was made public too quickly – there’s too much undocumented behaviour and several ways of achieving common tasks seem hackish. Too bad, it has potential, maybe we’ll check on it again in a few years.