
GitHub Action is stuck when trying to restore the cache

See original GitHub issue

👋 It seems I have encountered an issue while trying to restore the cache on GitHub Actions. (Sorry, I cannot post the repo as it is a private one.)

Here is some of the cache setup and a screenshot of the action stuck on cache restore.

with:
  path: ./vendor/bundle
  key: v1-bundle-${{ runner.OS }}-${{ hashFiles('.github/workflows/main.yml') }}-gemfile-${{ hashFiles('Gemfile.lock') }}
  restore-keys: |
    v1-bundle-${{ runner.OS }}-${{ hashFiles('.github/workflows/main.yml') }}-gemfile-${{ hashFiles('Gemfile.lock') }}
    v1-bundle-${{ runner.OS }}-${{ hashFiles('.github/workflows/main.yml') }}-
    v1-bundle-${{ runner.OS }}-
(Screenshot, 2020-03-05 at 15:42: the action stuck on the cache restore step)

As the screenshot shows, I had to stop the action after 15 minutes. I can see from some data points that the cache restore usually takes between 5 seconds and 1 minute, and the cache size is around 150 MB.

I hope the above are helpful. 🙏
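
For context, the with: block above belongs to an actions/cache step. A minimal sketch of the full step, using the action version and step id that appear later in the thread (the step name itself is an assumption), might look like this:

# Restore (and later save) the bundler cache; the id lets later steps check for a cache hit
- name: Cache gems            # step name is an assumption
  id: cache_bundler
  uses: actions/cache@v1
  with:
    path: ./vendor/bundle
    key: v1-bundle-${{ runner.OS }}-${{ hashFiles('.github/workflows/main.yml') }}-gemfile-${{ hashFiles('Gemfile.lock') }}
    restore-keys: |
      v1-bundle-${{ runner.OS }}-${{ hashFiles('.github/workflows/main.yml') }}-gemfile-${{ hashFiles('Gemfile.lock') }}
      v1-bundle-${{ runner.OS }}-${{ hashFiles('.github/workflows/main.yml') }}-
      v1-bundle-${{ runner.OS }}-

Note that the first restore key duplicates the primary key, which is harmless but redundant: actions/cache always tries the exact key before falling back to restore-keys.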

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 6
  • Comments: 16 (6 by maintainers)

Top GitHub Comments

2 reactions
berkos commented, Apr 23, 2020

> 👋 Hey @kevinrobinson thanks for reporting this. We’re looking into our telemetry to see what could cause these cache restore timeouts.
>
> How often would you say you hit these issues? Is it always with workflows that have multiple jobs pulling from the same cache?

Hey @joshmgross, I can also confirm that the cache freezes on parallel jobs that pull from the same cache created by a job they depend on.

Unfortunately, the logs do not provide enough data. The last line is me cancelling the job after 11 minutes of no progress.

2020-04-15T12:55:53.6484950Z ##[group]Run actions/cache@v1
2020-04-15T12:55:53.6485081Z with:
2020-04-15T12:55:53.6485178Z   path: ./vendor/bundle
2020-04-15T12:55:53.6485313Z   key: v1-bundle-Linux-fc51bfd6d906381f2bcacc1b5b1984f16461e7b18005cd73f235c4e3eab32a67-gemfile-badc4979dafd7179b692ffe63d166b2f09b6d649d7a65a659f8ee6f0e17e3d33
2020-04-15T12:55:53.6485493Z   restore-keys: v1-bundle-Linux-fc51bfd6d906381f2bcacc1b5b1984f16461e7b18005cd73f235c4e3eab32a67-gemfile-badc4979dafd7179b692ffe63d166b2f09b6d649d7a65a659f8ee6f0e17e3d33
v1-bundle-Linux-fc51bfd6d906381f2bcacc1b5b1984f16461e7b18005cd73f235c4e3eab32a67-
v1-bundle-Linux-

2020-04-15T12:55:53.6485683Z env:
2020-04-15T12:55:53.6485784Z   CONTINUOUS_INTEGRATION: true
2020-04-15T12:55:53.6485887Z   COVERAGE: true
2020-04-15T12:55:53.6485985Z   DISABLE_SPRING: true
2020-04-15T12:55:53.6486081Z   PGHOST: 127.0.0.1
2020-04-15T12:55:53.6486179Z   PGUSER: --
2020-04-15T12:55:53.6486282Z   PGPASSWORD: --
2020-04-15T12:55:53.6486351Z   PROFILE_RSPEC: true
2020-04-15T12:55:53.6486500Z   RAILS_CACHE_CLASSES: true
2020-04-15T12:55:53.6486602Z   RAILS_EAGER_LOAD: true
2020-04-15T12:55:53.6486699Z   RAILS_ENV: test
2020-04-15T12:55:53.6487172Z   RAILS_MASTER_KEY: ***
2020-04-15T12:55:53.6487246Z   SPEC_TIMEOUT_ENABLED: true
2020-04-15T12:55:53.6487348Z ##[endgroup]
2020-04-15T13:11:55.7018370Z ##[error]The operation was canceled.

The interesting thing is that it does not always happen. It was quite stable for 3 weeks and started again last week. I’ve changed the workflows/main.yml to add a few extra if: steps.cache_bundler.outputs.cache-hit != 'true' checks, in case the cache tries to be updated while there is already a cache hit. I’ll post again if I have more findings. Thanks 😃
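
For illustration, the guard mentioned above could be attached to the bundle install step roughly like this (a sketch only; the step name and install command are assumptions, while the cache_bundler id comes from the comment):

# Skip the install when the restore above found an exact key match,
# so nothing rewrites the cached bundle on a cache hit
- name: Install gems
  if: steps.cache_bundler.outputs.cache-hit != 'true'
  run: bundle install --path vendor/bundle --jobs 4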

2 reactions
kevinrobinson commented, Apr 17, 2020

@joshmgross sure! I’ve only set up GitHub Actions for one repository, and all the Ruby tasks for that repository run multiple jobs in parallel that all read from the same cache. It’s to parallelize rspec tests in CI (which, when the caching is working, is amazing and cuts the total test time down to 25% of what it was before! 👍). I’ve run into this issue on maybe ~40% of the days I’ve run these tasks over the last week or two, so timeouts and latency degradation this severe have been a major recurring issue, and not isolated to a single point in time.

I’ve shared some links and debug logs above, but let me know whatever other info would help y’all debug. Thanks! 😃
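
To make the failure scenario concrete, a workflow with the shape described in these comments, one job that builds the cache and several parallel jobs that restore it, might look roughly like the sketch below (job names, the matrix split, the commands, and the abbreviated cache key are assumptions, not the reporters' actual workflow):

jobs:
  bundle:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/cache@v1
        id: cache_bundler
        with:
          path: ./vendor/bundle
          key: v1-bundle-${{ runner.OS }}-${{ hashFiles('Gemfile.lock') }}
      - if: steps.cache_bundler.outputs.cache-hit != 'true'
        run: bundle install --path vendor/bundle --jobs 4

  rspec:
    needs: bundle               # waits for the cache-building job
    runs-on: ubuntu-latest
    strategy:
      matrix:
        group: [1, 2, 3, 4]     # parallel test jobs, all restoring the same cache
    steps:
      - uses: actions/checkout@v2
      - uses: actions/cache@v1
        with:
          path: ./vendor/bundle
          key: v1-bundle-${{ runner.OS }}-${{ hashFiles('Gemfile.lock') }}
      - run: bundle exec rspec  # in practice the suite would be split per matrix group

The hang described in the issue occurs during the restore phase of the actions/cache step inside these parallel jobs.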

Read more comments on GitHub >

Top Results From Across the Web

Cache Stuck · Issue #658 · actions/toolkit - GitHub
All my actions are running, but are refusing to cache with the same error: Post job cleanup. Unable to reserve cache with key...

Cache hits don't occur in Github Actions workflows · Issue #451
I solved it by defining the cache-dir outside the node_modules directory. If I let the cache dir inside the node_modules, the hash...

upgrade @action/cache to 3.0.4 to fix stuck issue #573 - GitHub
Description: @action/cache 3.0.3 fixes a download stuck issue; see https://github.com/actions/toolkit/blob/main/packages/cache/RELEASES.md#303 Related ...

Marketplace Actions Cache - GitHub
Cache artifacts like dependencies and build outputs to improve workflow execution time.

Unable to restore cache to certain locations #506 - GitHub
This action seems to be unable to restore the cache to certain locations. In this case, that would be paths starting with C:\ on Windows...
