PIP cache should cache the installed packages as well
Description:
Currently, setup-python caches only the ~/.cache/pip directory to avoid redownloads. However, it doesn’t cache the installed packages. Because some packages have lengthy installation steps, this leads to delays in builds.
You can see the current behaviour, for example, in https://github.com/crabhi/setup-python-cache-test/actions/runs/1789016634 (or in the attached build.txt): the pip install output shows “Collecting” and “Installing” instead of “Requirement already satisfied” for all packages.
Justification:
For example, installing the ansible package takes well over a minute even if it’s already downloaded.
Are you willing to submit a PR? Yes, I can try.
Issue Analytics
- State:
- Created 2 years ago
- Reactions:17
- Comments:23 (9 by maintainers)
Top GitHub Comments
Sorry, @dhvcc, if I didn’t manage to make myself clear. actions/setup-python@v4 uses actions/cache@v3 under the hood, and users do not need to call the actions/cache@v3 module themselves in an example such as:
It would be great if the installed packages could be cached as well (the purpose of this issue, #330) through actions/setup-python@v4.
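For context, a minimal sketch of the built-in caching this comment refers to. The `cache` input is part of actions/setup-python@v4; the Python version and the requirements.txt file are assumptions made for illustration, and this is not the example from the original comment.

```yaml
# Hypothetical workflow fragment: file names and versions are assumptions.
steps:
  - uses: actions/checkout@v3

  - uses: actions/setup-python@v4
    with:
      python-version: "3.10"
      # Restores and saves pip's download cache only.
      # Installed packages (site-packages) are not part of this cache.
      cache: "pip"

  # Packages come from the local cache instead of the network,
  # but each one is still built and installed on every run.
  - run: pip install -r requirements.txt
```

With this setup pip skips the downloads but still reports “Collecting” and “Installing” for each package, which is the behaviour the issue describes.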
So I have done quite a deep dive into the venv corruption issue, and I believe I know what happened, and how to avoid it as well.
The version of Python between when my cache was created and when it was restored changed. And I had a generic restore key which matched the old cache key. See detailed explanation below.
This is how my YAML file looked when I hit this error:
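A minimal sketch of the kind of workflow described in this comment, not the commenter's actual file. It assumes the venv lives in .venv and dependencies are listed in requirements.txt; the step names and paths are hypothetical. Only the unpinned Python 3.7 and the `pip-` restore key come from the comment itself.

```yaml
# Hypothetical reconstruction of the problematic pattern; paths are assumptions.
steps:
  - uses: actions/checkout@v3

  - uses: actions/setup-python@v4
    with:
      python-version: "3.7"   # patch version not pinned

  - uses: actions/cache@v3
    with:
      path: .venv
      # Nothing in the key or the restore key records the interpreter version.
      key: pip-${{ hashFiles('requirements.txt') }}
      restore-keys: |
        pip-

  - name: Install dependencies
    run: |
      # A restored .venv is reused as-is, even if it was built
      # against an interpreter that is no longer installed.
      test -d .venv || python -m venv .venv
      .venv/bin/python -m pip install -r requirements.txt
```

Because `restore-keys` matches by prefix, any older cache whose key starts with `pip-` can be restored, regardless of which Python release created the venv inside it.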
When this workflow initially ran and saved the venv to the cache, the latest release of Python 3.7 was 3.7.12 … meaning the venv it created had symlinks to 3.7.12.
However, a few days later, when the workflow ran again, the latest release of Python 3.7 was 3.7.13.
Notice that in my workflow I don’t pin the Python patch version, so actions/setup-python downloaded the latest available patch release of Python 3.7 (as expected).
However, my restore key pip- matched the old cache, which restored the old venv created for Python 3.7.12 … meaning all the symlinks inside were now broken! I had set up Python 3.7.13 but was trying to use a venv with symlinks to 3.7.12! That is why, when I tried to call the python executable from the venv, it could not find the file!
The resolution is to ensure that the output of setup-python is always part of the cache key, so that any change in Python version (even a patch version bump) creates a new cache key.
This is the code I have now; it has been working well without any issues. I have updated the workflow with the advice @dhvcc gave in the above comment: the venv is not touched if there is a cache hit.
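A minimal sketch of a workflow along these lines, not the commenter's actual file. It assumes a venv in .venv and a requirements.txt at the repository root; the step ids, step names and paths are hypothetical. `python-version` is an output of actions/setup-python and `cache-hit` is an output of actions/cache.

```yaml
# Hypothetical workflow fragment; ids, paths and file names are assumptions.
steps:
  - uses: actions/checkout@v3

  - uses: actions/setup-python@v4
    id: setup-python
    with:
      python-version: "3.7"   # patch version still not pinned

  - uses: actions/cache@v3
    id: venv-cache
    with:
      path: .venv
      # The resolved version (for example 3.7.13) is part of the key,
      # so a patch bump yields a new key. No broad restore-keys fallback.
      key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('requirements.txt') }}

  - name: Create venv and install dependencies
    # Skipped on an exact cache hit, so a restored venv is left untouched.
    if: steps.venv-cache.outputs.cache-hit != 'true'
    run: |
      python -m venv .venv
      .venv/bin/python -m pip install -r requirements.txt

  - name: Use the venv
    run: .venv/bin/python --version
```

Leaving out `restore-keys` means a cache miss rebuilds the venv from scratch. That costs one slow run after each Python or dependency change, but it avoids restoring a venv whose symlinks point at an interpreter that is no longer installed.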