Enable always writing cache to support hermetic build systems
I’d like to use actions/cache to cache my Bazel build state, which
includes dependencies that have been fetched, binaries and generated
code that have been built, and results for tests that have run. Bazel is
a hermetic build system, so the standard Bazel pattern is to always use
a single cache. Bazel will take care of invalidation at a fine-grained
level: if you only change one source file, it will only re-build and
re-test targets that depend on that source file.
Thus, the pattern that makes sense to me for Bazel projects is to always
fetch the cache and always store the cache. We can always fetch the
cache by using a constant cache key, but then the cache will never be
stored. Bazel doesn’t have a single package-lock.json-style file that
can be used as a cache key; it’s the combination of all build and source
files in the whole repository. We could use the Git tree (or commit)
hash as a cache key, but this would lead to storing a mountain of
caches, too, which seems wasteful.
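To make the constant-key problem concrete, a configuration along these lines illustrates it. This is only a sketch: the cache path and the key name are assumptions for illustration, not taken from any real repository.

```yaml
# Hypothetical workflow fragment; path and key name are assumptions.
steps:
  - uses: actions/checkout@v4
  - name: Restore Bazel cache
    uses: actions/cache@v4
    with:
      # Assumed location of Bazel's output root on a Linux runner.
      path: ~/.cache/bazel
      # Constant key: once an entry with this key exists, every later run
      # gets an exact hit, and the action does not save again on a hit.
      key: bazel-cache
```

With this setup the first run populates the cache, and later runs keep restoring that same first snapshot however much the repository changes.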
Ideally, the fetched cache would be taken from origin/master, but
really taking it from any recent commit should be fine, even if that
commit was in a broken or failing state.
On my repository, it takes 33 seconds to save the Bazel cache after a successful job, but on a clean cache it takes 2 minutes to fetch remote dependencies and 26 minutes to build all targets. I would be more than happy to pay those 33 seconds every time if it would save half an hour in the rest of the build!
For comparison, on Travis we achieve this by simply pointing to the Bazel cache directory: https://github.com/tensorflow/tensorboard/blob/1d1bd9a237fe23a3f2c31282ab44e7dfbcac717c/.travis.yml#L30-L32
Issue Analytics
- Created 4 years ago
- Reactions: 32
- Comments:20 (4 by maintainers)
Top GitHub Comments
I believe I have a similar use case to the issue described here, and ideally would like to see an update-cache option added to the action, but I’ve worked around the issue by leveraging the restore-keys option.
A project of mine consists largely of C files, and naturally a significant portion of my CI cycle time is spent in compilation. To speed things up, I’ve employed ccache, which will opportunistically recycle previously built object files when it detects that the compilation would be the same for the current build. This has a dramatic performance improvement on CI times. In order to do this though, I need some persistence of storage between workflow runs in order to save and restore ccache’s cache directory. Of course, as the code base evolves, the cache of object files will change too.
I was pleased to discover actions/cache, as it fits my use case very nicely; but, I was surprised to find that when a cache hit occurs, actions/cache will not attempt to update the cache at all, and there’s not an option to request such update.
To work around this, I do the following:
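A sketch of what such a step could look like, written from the description that follows rather than copied from the commenter's configuration. The ccache directory, the key prefix, and the build command are all assumptions for illustration.

```yaml
# Hypothetical workflow fragment; path, key prefix, and build command are assumptions.
steps:
  - uses: actions/checkout@v4
  - name: Restore ccache directory
    uses: actions/cache@v4
    with:
      # Assumed ccache directory; newer ccache versions may default to ~/.cache/ccache.
      path: ~/.ccache
      # The commit SHA makes the key unique per commit, so the exact lookup
      # misses and a new entry is saved when the job finishes.
      key: ccache-${{ runner.os }}-${{ github.sha }}
      # On a miss, fall back to the most recently created entry whose key
      # starts with this prefix.
      restore-keys: |
        ccache-${{ runner.os }}-
  - name: Build
    run: make
```
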
It works like this: when the cache is loaded for a workflow, there will be an initial cache miss because the cache key contains the current commit SHA. actions/cache will fall back to the most recently added cache via the restore-keys prefix matching policy, then, after the build has completed, create a new cache entry to satisfy the initial cache miss.
This solution seems to work very well for me, and hopefully this will be useful to others with a similar use case. Ideally though, I think actions/cache should just support updating the cache, perhaps to a new immutable revision, as I have done above.
I was thinking we would run that server on behalf of the user, so the operational overhead should be essentially the same as it would be for the existing cache action. I am not 100% sure that is the best option, but it seems like it might be a really good one for build systems that support it.