
PIP cache should cache the installed packages as well


Description: Currently, setup-python caches only the ~/.cache/pip directory to avoid redownloads. However, it doesn’t cache the installed packages. As some packages have lengthy installation steps, this leads to delays in builds.

You can see the current behaviour for example in https://github.com/crabhi/setup-python-cache-test/actions/runs/1789016634 (or in attached build.txt) - the pip install output shows “Collecting” and “Installing” instead of “Requirement already satisfied” for all packages.

Justification: For example, installing the ansible package takes well over a minute even if it’s already downloaded.

Are you willing to submit a PR? Yes, I can try.

Issue Analytics

State:
Created 2 years ago
Reactions: 17
Comments: 23 (9 by maintainers)

Top GitHub Comments

9 reactions

Axeln78 commented, Jul 13, 2022

Sorry, @dhvcc, if I didn’t make myself clear. actions/setup-python@v4 uses actions/cache@v3 under the hood, so users do not need to call actions/cache@v3 themselves in an example such as:

    - uses: actions/checkout@v3
    - name: Set up Python 3.10 and caches
      id: setup_and_cache
      uses: actions/setup-python@v4
      with:
        python-version: '3.10'
        cache: 'pip'

It would be great if the installed packages could be cached as well through actions/setup-python@v4 (the purpose of this issue, #330).

8 reactions

rashidnhm commented, May 14, 2022

Ok, nice. The code seemed ok, so that was strange. I’d only advise you to maybe not run pip install if the cache was hit, implying you don’t want to modify the cache in any way if it’s hit, to avoid corruption.

So I have done quite a deep dive into the venv corruption issue, and I believe I know what happened, and how to avoid it as well.

The Python version changed between when my cache was created and when it was restored, and I had a generic restore key that matched the old cache key. See the detailed explanation below.

This is how my YAML file looked when I hit this error:

# BAD CONFIG DO NOT USE (Illustrative purposes only)

- uses: actions/checkout@v3

- id: setup_python
  uses: actions/setup-python@v3
  with:
    python-version: 3.7

- id: python_cache
  uses: actions/cache@v3
  with:
    path: venv
    key: pip-${{ steps.setup_python.outputs.python-version }}-${{ hashFiles('requirements.txt') }}
    # The generic `pip-` restore key below was the cause of the issue
    restore-keys: |
      pip-${{ steps.setup_python.outputs.python-version }}-
      pip-

- if: steps.python_cache.outputs.cache-hit != 'true'
  run: |
    python3 -m venv venv

- run: |
    venv/bin/python3 -m pip install -r requirements.txt

When this workflow initially ran and saved the venv to cache, the latest release of Python 3.7 was 3.7.12, meaning the venv created had symlinks to 3.7.12.

However, a few days later, when the workflow ran again, the latest release of Python 3.7 was 3.7.13.

Notice in my workflow I don’t pin my Python patch version, so actions/setup-python downloaded the latest available patch release of Python 3.7 (as expected).

However, my restore key pip- matched the old cache, which restored the old venv created for Python 3.7.12, meaning all the symlinks inside were now broken! I had set up Python 3.7.13 but was trying to use a venv with symlinks to 3.7.12! That is why, when I tried to call the python executable from the venv, it could not find the file!
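To make the matching behaviour concrete, here is a minimal shell sketch of prefix matching on cache keys. It only simulates the idea and is not the actual actions/cache implementation; the helper name `key_matches_prefix` and the key value `pip-3.7.12-abc123` are made up for illustration.

```shell
# Hypothetical helper: succeeds if the cache key ($1) starts with the
# restore key ($2). This mimics prefix matching in a simplified way.
key_matches_prefix() {
  case "$1" in
    "$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Made-up key, standing in for the cache saved while 3.7.12 was the latest patch.
old_key="pip-3.7.12-abc123"

# Try a version-specific restore key for the new patch, then the generic one.
for restore_key in "pip-3.7.13-" "pip-"; do
  if key_matches_prefix "$old_key" "$restore_key"; then
    echo "restore key '$restore_key' matches old cache '$old_key'"
  else
    echo "restore key '$restore_key' does not match '$old_key'"
  fi
done
```

Only the generic `pip-` prefix matches the old key, which is how a venv built for 3.7.12 could end up restored next to a 3.7.13 interpreter.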

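The resolution is to ensure that the python-version output of setup-python is always part of the cache key, and not to use a generic restore key that can match a cache built for a different version. That way, any change in Python version (even a patch version bump) creates a new cache key.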
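As a rough illustration of why including the full version helps, the sketch below composes a key of the same shape in plain shell. The helper `make_cache_key` is hypothetical, the requirements file contents are made up, and it assumes `sha256sum` is available (as on typical Linux runners). The digest is not claimed to equal what `hashFiles()` produces; only the structure of the key is being shown.

```shell
# Hypothetical helper: build a key from the exact interpreter version ($1)
# and a SHA-256 digest of the requirements file ($2).
make_cache_key() {
  printf 'pip-%s-%s\n' "$1" "$(sha256sum "$2" | cut -d' ' -f1)"
}

# Throwaway requirements file with made-up contents.
req_file="$(mktemp)"
printf 'ansible\n' > "$req_file"

key_old="$(make_cache_key "3.7.12" "$req_file")"
key_new="$(make_cache_key "3.7.13" "$req_file")"

# Same requirements, different patch version: the keys must differ.
if [ "$key_old" != "$key_new" ]; then
  echo "patch bump produces a different key, so the old venv is not reused"
fi

rm -f "$req_file"
```

With no generic restore key to fall back on, a different key means a cache miss, and the venv gets rebuilt against the interpreter that was actually installed.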

This is the code I have now, and it has been working well without any issues. I have updated the workflow with the advice @dhvcc gave in the above comment. The venv is not touched if there is a cache hit.

- uses: actions/checkout@v3

- id: setup_python
  uses: actions/setup-python@v3
  with:
    python-version: 3.7

- id: python_cache
  uses: actions/cache@v3
  with:
    path: venv
    key: pip-${{ steps.setup_python.outputs.python-version }}-${{ hashFiles('requirements.txt') }}

- if: steps.python_cache.outputs.cache-hit != 'true'
  run: |
    # Check if venv exists (restored from secondary keys if any, and delete)
    # You might not need this line if you only have one primary key for the venv caching
    # I kept it in my code as a fail-safe
    if [ -d "venv" ]; then rm -rf venv; fi
    
    # Re-create the venv
    python3 -m venv venv

    # Install dependencies
    venv/bin/python3 -m pip install -r requirements.txt
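
As an extra fail-safe, one could also check that a restored venv's interpreter still resolves before using it. The following is a minimal sketch, not part of the workflow above; the helper name `venv_python_ok` is hypothetical, and the demo uses a throwaway directory with a made-up link target to reproduce the dangling-symlink situation described earlier.

```shell
# Hypothetical helper: succeeds only if the venv's interpreter path resolves
# to an existing executable. The -x test follows symlinks, so a link whose
# target has disappeared makes it fail.
venv_python_ok() {
  [ -x "$1/bin/python3" ]
}

# Reproduce the failure in a throwaway directory. The link target is a
# made-up path that intentionally does not exist.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/venv/bin"
ln -s "$demo_dir/python-3.7.12/bin/python3" "$demo_dir/venv/bin/python3"

if venv_python_ok "$demo_dir/venv"; then
  echo "venv interpreter resolves"
else
  echo "venv interpreter is missing or dangling; recreate the venv"
fi

rm -rf "$demo_dir"
```

In a real workflow, a check like this could decide whether to delete and recreate the venv, in the same spirit as the `rm -rf venv` fail-safe above.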