Implement PEP 643 to optimise `pip download --no-binary`
Currently, when pip downloads sdists, it is sometimes necessary for it to build a wheel in order to get trustworthy metadata (#7995, #1884). Ideally it wouldn’t need to do so, but that is a difficult problem to solve.
I would instead suggest that when pip does need to build a wheel during download, it should default to saving the resulting wheel alongside the sdist. This would let users avoid building the wheel twice in some cases, which could save a considerable amount of time.
While there are other cases where that wheel wouldn’t be useful to the user (they want to make modifications first, control build settings, etc.), it doesn’t hurt them to have it, as it was being built anyway, and thus the time and disc space requirements are unchanged. People using `pip download` in scripts and the like may have to make changes to account for the additional file, so a transition plan would be needed.
I think the closest existing alternative to this is `pip download --build ./tmp --no-clean`. The main downside to that is knowing and remembering that it is necessary. Users are often surprised that `pip download` builds packages. Even when aware of the issue it can be easy to forget, especially when an sdist is downloaded because no appropriate wheels are available rather than because `--no-binary` was used.
This might dovetail with #9769 which aims to always make wheels as an intermediate step in installing sdists.
Issue Analytics
- Created 2 years ago
- Comments:18 (11 by maintainers)
Top GitHub Comments
There are a few things we need to unpack here. First of all, pip does not build a wheel for metadata; it only asks the build backend to produce package metadata, which is expected to require minimal effort. Some packages are configured in such a, well, let’s say suboptimal way that a metadata build request results in an entire package build, but that fact is opaque to pip, and the build result is usually still short of a wheel, so technically there’s no wheel to save.
Even if we ignore all of that and say let’s just make pip build a wheel for metadata (which means we’re unreasonably punishing well-configured projects for less cooperative ones, but let’s assume we can magically avoid that), there is still the fundamental issue that a wheel build is not guaranteed to be stable. As you may have noticed, pip does not cache the wheels it builds from source for installation, because a compilation step is essentially remote code execution and there’s nothing stopping a package from producing a different wheel when built a second time (and a lot of projects do that in the real world, in forms like build-time feature detection and conditional compilation). So making pip cache wheels built from source would introduce a cache invalidation issue (when can a wheel be reused?), one of the “two most difficult things” in computer science.
In the end, everything boils down to one fundamental fact: without outside input, it is impossible to be sure that a collection of source code is trustworthy in any way without building it from scratch. What pip needs is some additional flags in that source tree to tell pip what it can cache. And there is already a standard for that: PEP 643. If a source distribution implements that PEP, pip can determine at fine granularity which package metadata in it is trustworthy, and avoid the build step when it can. So what I would suggest is to contribute to wheel builders (e.g. setuptools) to see PEP 643 implemented (see pypa/setuptools#2685), and once those projects start generating sdists with appropriate metadata, pip can start doing what you want. Without that, anything pip does to solve your problems will cause us to get complaints from another group of people.
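To make the PEP 643 suggestion a bit more concrete, here is a minimal sketch of the kind of check it enables. This is not pip's actual implementation; the function name `metadata_is_static` and the default list of fields are hypothetical, chosen only for illustration. It relies on two things PEP 643 defines: sdist metadata (`PKG-INFO`) with `Metadata-Version` 2.2 or later, and the `Dynamic` field, which lists the fields whose values may change when a wheel is built.

```python
from email.parser import HeaderParser


def metadata_is_static(pkg_info_text, wanted=("Requires-Dist", "Requires-Python")):
    """Return True if the `wanted` fields can be read from PKG-INFO without a build.

    Hypothetical helper for illustration only, not part of pip.
    """
    # Core metadata uses an RFC 822 style header format.
    msg = HeaderParser().parsestr(pkg_info_text)

    # Before Metadata-Version 2.2 an sdist makes no promise about its fields.
    try:
        parts = msg.get("Metadata-Version", "0").strip().split(".")
        version = tuple(int(p) for p in parts[:2])
    except ValueError:
        return False
    if version < (2, 2):
        return False

    # Fields listed under Dynamic may differ in a built wheel, so they
    # cannot be trusted without building. Field names are case-insensitive.
    dynamic = {value.strip().lower() for value in msg.get_all("Dynamic", [])}
    return all(field.lower() not in dynamic for field in wanted)


STATIC_SDIST = (
    "Metadata-Version: 2.2\n"
    "Name: demo\n"
    "Version: 1.0\n"
    "Requires-Dist: requests\n"
)
DYNAMIC_SDIST = (
    "Metadata-Version: 2.2\n"
    "Name: demo\n"
    "Version: 1.0\n"
    "Dynamic: Requires-Dist\n"
)
OLD_SDIST = "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n"

print(metadata_is_static(STATIC_SDIST))
print(metadata_is_static(DYNAMIC_SDIST))
print(metadata_is_static(OLD_SDIST))
```

With a check along these lines, a downloader would only need to fall back to a metadata build for sdists that predate Metadata-Version 2.2 or that mark the fields it cares about as dynamic.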
Thanks for the feedback, opened https://github.com/pypa/setuptools/discussions/2814.