Performance opportunity for yarn 3 cache
Thanks for the new cache feature. Much easier.
After a few weeks, I realized that while it supports yarn, there are some improvements that could be made.
Yarn 3 (and probably Yarn 2+ too) manages downloaded archives pretty well (.yarn/cache/*.zip), but invalidating the cache on every yarn.lock change does not take that into account, and I saw a lot of cache misses.
As an example, I converted back to actions/cache to illustrate and test.
I'm wondering if a similar approach could be taken with setup-node?
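For reference, the built-in cache being compared here is the one enabled through the `cache` input of setup-node. A minimal sketch, assuming a workflow pinned to `actions/setup-node@v2` and an illustrative Node version (neither is stated in the issue):

```yaml
steps:
  - uses: actions/checkout@v2
  - uses: actions/setup-node@v2
    with:
      node-version: 16   # illustrative value, not taken from the issue
      cache: yarn        # built-in caching; as described above, a changed yarn.lock leads to a cache miss
  - run: yarn install --immutable
```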
Updated example with action cache
Setup action
# Get the yarn cache path.
- name: Get yarn cache directory path
  id: yarn-cache-dir-path
  run: echo "::set-output name=dir::$(yarn config get cacheFolder)"
- name: Restore yarn cache
  uses: actions/cache@v2
  id: yarn-cache # use this to check for `cache-hit` (`steps.yarn-cache.outputs.cache-hit != 'true'`)
  with:
    path: ${{ steps.yarn-cache-dir-path.outputs.dir }}
    key: yarn-cache-folder-${{ hashFiles('**/yarn.lock', '.yarnrc.yml') }}
    restore-keys: |
      yarn-cache-folder-
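The `::set-output` workflow command used above has since been deprecated by GitHub in favor of the `$GITHUB_OUTPUT` environment file. A sketch of the same first step in that style, assuming a runner recent enough to provide `$GITHUB_OUTPUT`; the restore step is unchanged:

```yaml
- name: Get yarn cache directory path
  id: yarn-cache-dir-path
  # Write the output to the environment file instead of using ::set-output.
  run: echo "dir=$(yarn config get cacheFolder)" >> "$GITHUB_OUTPUT"
```

The output is still read as `steps.yarn-cache-dir-path.outputs.dir`, so the `path:` line of the cache step does not need to change.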
Testing a cache hit after adding a dependency
With the setup-node cache, because yarn.lock has changed, all packages would be fetched again (1m 28s). With the restored cache folder, only the new package is fetched (1s 124ms). Environmentally friendlier 🌳
yarn install --immutable
➤ YN0000: ┌ Fetch step
➤ YN0013: │ 1719 packages were already cached, one had to be fetched (superjson@npm:1.7.5)
➤ YN0000: └ Completed in 1s 124ms
Cache Size: ~127 MB (133261279 B)
Cache saved successfully
Cache saved with key: yarn-cache-folder-os-Linux-node--f118ea4bee07eada9df36ad2e83fd6febcbaf06b5b8962689c7650659e872ad3
PS: Key points
~Example with action cache~ (old version, before @merceyz improvements)
- name: Get yarn cache directory path
  id: yarn-cache-dir-path
  run: echo "::set-output name=dir::$(yarn config get cacheFolder)"
- name: Restore yarn cache
  uses: actions/cache@v2
  id: yarn-cache
  with:
    path: ${{ steps.yarn-cache-dir-path.outputs.dir }}
    key: yarn-cache-folder-os-${{ runner.os }}-node-${{ env.node-version }}-${{ hashFiles('**/yarn.lock', '.yarnrc.yml') }}
    restore-keys: |
      yarn-cache-folder-os-${{ runner.os }}-node-${{ env.node-version }}-
      yarn-cache-folder-os-${{ runner.os }}-
Issue Analytics
- State:
- Created 2 years ago
- Reactions:19
- Comments:11 (3 by maintainers)
Top GitHub Comments
Just to share, here’s my updated composite action for
It can be made even more efficient by also leaving the OS and Node version out of the cache key - https://github.com/actions/setup-node/pull/272#issuecomment-873564091