Potentially better and faster pip caching


I’ve followed the instructions to get pip cached successfully.

It’s important to understand that the pip cache only affects downloading of packages, not the actual installation. My app is not huge by any means, but still, even with cached downloads, installation takes about a minute or more.

In fact, without pip cache at all, downloads are actually pretty quick.

I had much better performance with this hack:

[Image: screenshot of the caching configuration; not reproduced here]

Notice that I’m caching the actual pip directories. I could hard-code the value (I only test on Ubuntu), but I wanted to stick to the principle of looking up the site path.

It’s still hacky, because of binaries. I need to be able to run pytest, so caching just site-packages was not enough. getsitepackages() gives you something like /opt/hostedtoolcache/Python/3.7.6/x64/lib/python3.7/site-packages/, and if you restore that directory, pip knows not to reinstall anything.

However, the executables those packages install (such as pytest) will be missing from the path, so I traverse up a few directories and cache the whole tree.

I’m not sure whether this approach is worth documenting, or whether a better layer-based cache of some sort is coming that will make this moot.
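The traversal described above can be sketched in Python. This is only an illustration, not the configuration from the screenshot: the helper name interpreter_root and the default of three levels are assumptions, chosen to match the example hosted-toolcache path quoted earlier, and the sketch assumes a POSIX-style layout.

```python
import site
from pathlib import PurePosixPath


def interpreter_root(site_packages: str, levels_up: int = 3) -> PurePosixPath:
    """Return the directory `levels_up` levels above a site-packages path.

    Hypothetical helper: three levels is an assumption that fits a layout
    like <root>/lib/pythonX.Y/site-packages, where <root> also holds bin/.
    """
    if levels_up < 1:
        raise ValueError("levels_up must be at least 1")
    # PurePosixPath drops the trailing slash; parents[0] is the direct parent.
    return PurePosixPath(site_packages).parents[levels_up - 1]


if __name__ == "__main__":
    # site.getsitepackages() lists the interpreter's site-packages directories.
    paths = site.getsitepackages()
    if paths:
        print(interpreter_root(paths[0]))
```

For the example path, this yields /opt/hostedtoolcache/Python/3.7.6/x64, the directory that contains both lib/ and bin/. Other layouts may need a different number of levels.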

But it cut down my pip setup from about 1 minute to 1 second.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 14 (2 by maintainers)

Top GitHub Comments

3 reactions
pradyunsg commented, Jun 1, 2020

Please don’t cache site-packages or entire interpreter trees. That is fragile (sensitive to Python patch version and OS) and pip should be pretty fast as long as its cache is populated.

If there are instances where pip’s still slow, please file an issue on pip’s issue tracker (look for an existing one before filing a new one), and we should put in the work to enhance pip. It’s not gonna magically happen on its own, but I don’t think pushing for fragile optimizations on CI providers is a good alternative approach.

Users who are willing to deal with the consequences of such a caching strategy can do so in their specific CI pipelines themselves. I would be very concerned if this became something that GitHub Actions or Azure Pipelines started suggesting that users do, or even made easier by providing a mechanism to do so via an official-esque package.

3 reactions
webknjaz commented, Feb 15, 2020

It’s “kinda” valid. But it’s safe to use only if the OS and Python version are exactly the same. So this has to be accounted for in the cache key.
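One way to account for this in a cache key is sketched below. It is a minimal illustration under stated assumptions, not a recommended setup: the function name cache_key, the "pip-env" prefix, and the key layout are all hypothetical. The default OS identifier covers only system and architecture, so a real pipeline would likely need to add its runner image version too.

```python
import hashlib
import platform
from typing import Optional


def cache_key(requirements_text: str,
              os_id: Optional[str] = None,
              python_version: Optional[str] = None,
              prefix: str = "pip-env") -> str:
    """Build a cache key that changes with OS, exact Python version, or requirements.

    Hypothetical helper; the key layout is an assumption for illustration.
    """
    if os_id is None:
        # System and architecture only, e.g. "Linux-x86_64".
        os_id = f"{platform.system()}-{platform.machine()}"
    if python_version is None:
        # Full version including the patch level, e.g. "3.7.6".
        python_version = platform.python_version()
    # Hash the requirements so any dependency change produces a new key.
    digest = hashlib.sha256(requirements_text.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}-{os_id}-py{python_version}-{digest}"
```

With a key like this, a patch-level Python upgrade (say 3.7.6 to 3.7.7) or a changed requirements file misses the old cache instead of restoring a mismatched tree.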


