Enable always writing cache to support hermetic build systems

I’d like to use actions/cache to cache my Bazel build state, which includes dependencies that have been fetched, binaries and generated code that have been built, and results for tests that have run. Bazel is a hermetic build system, so the standard Bazel pattern is to always use a single cache. Bazel will take care of invalidation at a fine-grained level: if you only change one source file, it will only re-build and re-test targets that depend on that source file.

Thus, the pattern that makes sense to me for Bazel projects is to always fetch the cache and always store the cache. We can always fetch the cache by using a constant cache key, but then the cache will never be stored again once an entry exists for that key. Bazel doesn’t have a single package-lock.json-style file that can be used as a cache key; the relevant input is the combination of all build and source files in the whole repository. We could use the Git tree (or commit) hash as a cache key, but this would lead to storing a mountain of caches, which seems wasteful.
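For illustration, the constant-key setup described above might look like the following sketch. The step name, the key string `bazel-cache`, and the path `~/.cache/bazel` are assumptions made for this example (the path is Bazel's default output root on Linux); none of them come from the original issue.

```yaml
# Sketch only: a constant cache key, as described above.
- name: Cache Bazel state
  uses: actions/cache@v1
  with:
    # Assumed location: Bazel's default output user root on Linux.
    path: ~/.cache/bazel
    # Hypothetical constant key. The first run misses and saves the cache;
    # every later run gets an exact hit, so the save step is skipped
    # and the stored cache never picks up newer build outputs.
    key: bazel-cache
```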

Ideally, the fetched cache would be taken from origin/master, but really taking it from any recent commit should be fine, even if that commit was in a broken or failing state.

On my repository, it takes 33 seconds to save the Bazel cache after a successful job, but on a clean cache it takes 2 minutes to fetch remote dependencies and 26 minutes to build all targets. I would be more than happy to pay those 33 seconds every time if it would save half an hour in the rest of the build!

For comparison, on Travis we achieve this by simply pointing to the Bazel cache directory: https://github.com/tensorflow/tensorboard/blob/1d1bd9a237fe23a3f2c31282ab44e7dfbcac717c/.travis.yml#L30-L32

Issue Analytics

State:
Created 4 years ago
Reactions: 32
Comments:20 (4 by maintainers)

Top GitHub Comments

18 reactions

mborgerson commented, Nov 26, 2019

I believe I have a similar use case to the issue described here, and ideally would like to see an update-cache option added to the action, but I’ve worked around the issue by leveraging the restore-keys option.

A project of mine consists largely of C files, and naturally a significant portion of my CI cycle time is spent in compilation. To speed things up, I’ve employed ccache, which will opportunistically recycle previously built object files when it detects that the compilation would be the same for the current build. This dramatically improves CI times. To do this, though, I need storage that persists between workflow runs so that I can save and restore ccache’s cache directory. Of course, as the code base evolves, the cache of object files will change too.

I was pleased to discover actions/cache, as it fits my use case very nicely; but I was surprised to find that when a cache hit occurs, actions/cache will not attempt to update the cache at all, and there is no option to request such an update.

To work around this, I do the following:

    - name: Initialize Compiler Cache
      id: cache
      uses: actions/cache@v1
      with:
        path: /tmp/xqemu-ccache
        key: cache-${{ runner.os }}-${{ matrix.configuration }}-${{ github.sha }}
        restore-keys: cache-${{ runner.os }}-${{ matrix.configuration }}-

It works like this: when the cache is loaded for a workflow, there will be an initial cache miss because the cache key contains the current commit SHA. actions/cache will fall back to the most recently added cache via the restore-keys prefix-matching policy. Then, after the build has completed, it creates a new cache entry to satisfy the initial cache miss.

This solution seems to work very well for me, and hopefully this will be useful to others with a similar use case. Ideally, though, I think actions/cache should just support updating the cache, perhaps to a new immutable revision, as I have done above.
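The same workaround could plausibly be adapted to the Bazel case from the original issue. The following is a minimal sketch, not a tested configuration: the step names, the `bazel-` key prefix, the path `~/.cache/bazel` (Bazel's default output root on Linux), and the `bazel test //...` invocation are all assumptions made for illustration.

```yaml
# Sketch only: the restore-keys workaround applied to a Bazel cache.
- name: Restore Bazel cache
  uses: actions/cache@v1
  with:
    # Assumed location: Bazel's default output user root on Linux.
    path: ~/.cache/bazel
    # The commit SHA makes the primary key unique, so it always misses
    # and a fresh cache entry is saved at the end of the job.
    key: bazel-${{ runner.os }}-${{ github.sha }}
    # Prefix match: falls back to the most recently saved Bazel cache.
    restore-keys: bazel-${{ runner.os }}-
- name: Build and test
  # Bazel reuses whatever is still valid in the restored cache.
  run: bazel test //...
```

Note that this stores one cache entry per commit, which is the "mountain of caches" trade-off the issue author mentions.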

6 reactions

chrispat commented, Nov 27, 2019

A truly remote cache is an appealing option, but comes with a lot more operational overhead for the user. Storing files is much easier than running a server.

I was thinking we would run that server on behalf of the user, so the operational overhead should be essentially the same as it would be for the existing cache action. I am not 100% sure that is the best option, but it seems like it might be a really good one for build systems that support it.
