Implement PEP 643 to optimise `pip download --no-binary`

Currently, when pip downloads sdists, it sometimes needs to build a wheel in order to get trustworthy metadata (see #7995 and #1884). Ideally it wouldn’t need to do so, but that is a difficult problem to solve.

I would instead suggest that when pip does need to build a wheel during download, it should default to saving the resulting wheel alongside the sdist. This would let users avoid building the wheel twice in some cases, which could save a considerable amount of time.

There are other cases where that wheel wouldn’t be useful to the user (they want to make modifications first, control build settings, etc.), but it doesn’t hurt them to have it: it was being built anyway, so the time and disk space requirements are unchanged. People using `pip download` in scripts and the like may have to make changes to account for the additional file, so a transition plan would be needed.

I think the closest existing alternative to this is `pip download --build ./tmp --no-clean`. The main downside to that is knowing and remembering that it is necessary. Users are often surprised that `pip download` builds packages. Even when aware of the issue it can be easy to forget, especially when an sdist is downloaded because no appropriate wheels are available rather than because `--no-binary` was used.

This might dovetail with #9769, which aims to always build wheels as an intermediate step when installing sdists.

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 18 (11 by maintainers)

Top GitHub Comments

1 reaction
uranusjr commented, Jul 24, 2021

There are a few things we need to unpack here. First of all, pip does not build a wheel for metadata; it only asks the build backend to produce package metadata, which is expected to require minimal effort. Some packages are configured in a, well, let’s say suboptimal way, such that a metadata build request results in an entire package build. That fact is opaque to pip, though, and the build result is usually still short of a wheel, so technically there’s no wheel to save.

Even if we ignore all of that and say let’s just make pip build a wheel for metadata (which means we’re unreasonably punishing well-configured projects for less cooperative ones, but let’s assume we can magically avoid that), there is still the fundamental issue that a wheel build is not guaranteed to be stable. If you’ve ever noticed, pip does not cache wheels it built from source for installation, because a compilation step is essentially remote code execution and there’s nothing stopping a package from producing a different wheel when built a second time (and a lot of projects do that in the real world, in forms like build-time feature detection and conditional compilation). So making pip cache wheels built from source introduces a cache invalidation issue (when can a wheel be reused?), one of the “two most difficult things” in computer science.
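
To make the stability concern concrete, here is a minimal sketch that compares two wheels built from the same sdist by hashing them. The function names `sha256_of` and `builds_match` are made up for this illustration and are not part of pip. Byte-for-byte equality is a deliberately strict check: two builds can differ only in archive timestamps and still behave identically, so a mismatch shows the build is not reproducible, not that the wheels necessarily behave differently.

```python
import hashlib


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(wheel_a, wheel_b):
    """True only if the two wheel files are byte-for-byte identical."""
    return sha256_of(wheel_a) == sha256_of(wheel_b)
```

Building the same sdist twice into separate directories and passing both wheel paths to `builds_match` gives a quick signal of whether a previously built wheel could plausibly stand in for a fresh build.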

In the end, everything boils down to one fundamental fact: without outside input, it is impossible to make sure a collection of source code is trustworthy in any way without building it from scratch. What pip needs is some additional flags in that source tree to tell pip what it can cache. And there is already a standard for that: PEP 643. If a source distribution implements that PEP, pip can determine, at fine granularity, whether the package metadata in it is trustworthy, and avoid the build step when it can. So what I would suggest is to contribute to wheel builders (e.g. setuptools) to see PEP 643 implemented (see pypa/setuptools#2685), and once those projects start generating sdists with appropriate metadata, pip can start doing what you want. Without that, anything pip does to solve your problems will cause us to get complaints from another group of people.
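
As an illustration of what PEP 643 makes possible, the sketch below reads the `PKG-INFO` file from an sdist and reports which metadata fields can be trusted without a build. It relies on the PEP's rule that, for `Metadata-Version` 2.2 or later, any field not listed under `Dynamic` must be the same in wheels built from that sdist. The helper names `read_pkg_info` and `static_fields`, and the default set of fields checked, are assumptions made for this example; this is not how pip itself is implemented.

```python
import tarfile
from email.parser import HeaderParser


def read_pkg_info(sdist_path):
    """Parse the top-level PKG-INFO of a .tar.gz sdist into a message object."""
    with tarfile.open(sdist_path, "r:gz") as tar:
        for member in tar.getmembers():
            parts = member.name.split("/")
            # PKG-INFO sits directly inside the "{name}-{version}/" directory.
            if len(parts) == 2 and parts[1] == "PKG-INFO":
                text = tar.extractfile(member).read().decode("utf-8")
                return HeaderParser().parsestr(text)
    raise ValueError("no top-level PKG-INFO found in sdist")


def static_fields(pkg_info,
                  wanted=("Requires-Dist", "Requires-Python", "Provides-Extra")):
    """Return the subset of `wanted` fields that PEP 643 marks as reliable."""
    raw_version = pkg_info.get("Metadata-Version", "1.0")
    version = tuple(int(part) for part in raw_version.split("."))
    if version < (2, 2):
        # Older metadata makes no promise, so nothing can be trusted.
        return set()
    # Field names are case-insensitive; Dynamic may appear multiple times.
    dynamic = {value.strip().lower() for value in pkg_info.get_all("Dynamic", [])}
    return {field for field in wanted if field.lower() not in dynamic}
```

If `static_fields(read_pkg_info(path))` includes `Requires-Dist`, a resolver could in principle take the dependency list straight from the sdist; otherwise it would still have to ask the build backend for metadata.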

0 reactions
tvalentyn commented, Oct 13, 2021

Thanks for the feedback, opened https://github.com/pypa/setuptools/discussions/2814.
