Downloads of GitHub releases aren't cached

Environment

  • pip version: 20.2
  • Python version: 3.6.10
  • OS: macOS

Description

Downloads of GitHub release packages are not cached.

Expected behavior

Subsequent downloads read from the local cache, instead of downloading again.

How to Reproduce

$ python3 -m venv --clear /tmp/venv
$ /tmp/venv/bin/pip install --upgrade pip
Collecting pip
  Using cached https://files.pythonhosted.org/packages/36/74/38c2410d688ac7b48afa07d413674afc1f903c1c1f854de51dc8eb2367a5/pip-20.2-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 18.1
    Uninstalling pip-18.1:
      Successfully uninstalled pip-18.1
Successfully installed pip-20.2

$ /tmp/venv/bin/pip install https://github.com/tekumara/spark/releases/download/v2.4.5-cloud/pyspark-2.4.5.tar.gz

Collecting https://github.com/tekumara/spark/releases/download/v2.4.5-cloud/pyspark-2.4.5.tar.gz
  Downloading https://github.com/tekumara/spark/releases/download/v2.4.5-cloud/pyspark-2.4.5.tar.gz (268.3 MB)
     |███████▌                        | 63.1 MB 4.5 MB/s eta 0:00:46

When the install command above is repeated, pyspark-2.4.5.tar.gz is downloaded again rather than being read from the cache.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

tekumara commented, Oct 18, 2020 (1 reaction)

So if I use a named URL spec (name @ URL), pip will use the cached version rather than downloading it again:

$ pip install 'pyspark @ https://github.com/tekumara/spark/releases/download/v2.4.5-cloud/pyspark-2.4.5.tar.gz'
Processing /Users/tekumara/Library/Caches/pip/wheels/4d/8c/99/636fbcc2942d25483272d77c0654cd921907ff73717f7b5627/pyspark-2.4.5-py2.py3-none-any.whl
Collecting py4j==0.10.7
  Using cached py4j-0.10.7-py2.py3-none-any.whl (197 kB)
Installing collected packages: py4j, pyspark

This is using pip 20.2.3. Seems like a pretty workable solution.
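
For anyone scripting this workaround, here is a minimal sketch that turns a bare sdist URL into the named `name @ URL` form shown above. The helper name `named_requirement` and the filename heuristic (project name is everything before the last hyphen of the archive stem) are assumptions for illustration, not part of pip; the heuristic only suits simple `name-version` sdist filenames.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Archive suffixes this sketch recognises (an assumption; extend as needed).
SDIST_SUFFIXES = (".tar.gz", ".tar.bz2", ".zip")


def named_requirement(url: str) -> str:
    """Build a 'name @ url' requirement string from an sdist URL.

    Hypothetical helper: it guesses the project name from the filename,
    assuming the usual '<name>-<version><suffix>' sdist layout.
    """
    filename = PurePosixPath(urlparse(url).path).name
    for suffix in SDIST_SUFFIXES:
        if filename.endswith(suffix):
            stem = filename[: -len(suffix)]
            break
    else:
        raise ValueError(f"unrecognised archive type: {filename!r}")

    # Split on the last hyphen: 'pyspark-2.4.5' -> ('pyspark', '-', '2.4.5').
    name, sep, _version = stem.rpartition("-")
    if not sep or not name:
        raise ValueError(f"cannot infer a project name from {filename!r}")
    return f"{name} @ {url}"


if __name__ == "__main__":
    url = (
        "https://github.com/tekumara/spark/releases/download/"
        "v2.4.5-cloud/pyspark-2.4.5.tar.gz"
    )
    # Prints a string that can be passed to `pip install` as one quoted argument.
    print(named_requirement(url))
```

The result needs quoting when passed to pip on the command line, as in the command above, because it contains spaces.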

uranusjr commented, Jul 29, 2020 (1 reaction)

@pfmoore Ah right, I confused this with the wheel cache rules, which do not apply here, of course. HTTP caching should be fine. Sorry.
