Optimize `pip download --no-binary` for packages using PEP-517 by ignoring --no-binary when building isolated build environment for purposes of verifying downloaded sdist metadata
What’s the problem this feature will solve?
`pip download --no-binary :all:` is sometimes too slow for packages that have switched to PEP-517, or that have packages using PEP-517 in their dependency chain.
To observe this issue, run pip download in verbose mode:
pip -v download --dest /tmp numpy==1.21.2 --no-deps --no-binary :all:
Users have brought up this issue a couple of times; however, there seems to be some confusion about the actual reasons for this behavior.
Possible explanations that are likely incorrect:
- It is not a pip issue, but a setuptools issue.
  - We can see that this issue also happens for packages using poetry. For example, see:
    pip -v download --dest /tmp poetry --no-deps --no-binary :all:
- It is not a pip issue, but it happens to packages using a legacy / convoluted build process.
  - I believe this can happen to any package that declares build dependencies via PEP-517. Even installing Cython, which is a common build dependency, takes significant time when it is installed from sources.
- It is a known issue; this issue is yet another duplicate of https://github.com/pypa/pip/issues/1884
  - Yes and no.
  - Yes, it is a known issue that pip needs to verify dependency metadata after downloading an sdist, so pip builds this metadata after downloading sources.
  - However, there is a new aspect of the issue that gained higher severity after packages started to adopt PEP-517, because the usability of `pip download --no-binary` has decreased to the point of becoming unusable. Example: https://github.com/pypa/pip/issues/9701.
Describe the solution you’d like
A user of `pip download --no-binary` cares about sdists of the package and its dependencies. However, a user likely doesn’t care whether sdists or bdists of build deps are used for package metadata verification purposes.
After `pip download --no-binary` has downloaded an sdist and starts verifying package metadata via the `prepare_metadata_for_build_wheel` hook, pip should ignore the `--no-binary` flag when creating an isolated build environment for the purpose of metadata verification.
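The requested behavior can be sketched as a small decision function. This is a toy model, not pip's implementation: `NoBinary`, `allows_wheel`, and `allows_wheel_proposed` are hypothetical names introduced only for illustration, and the names stored in `NoBinary` are assumed to be already normalized.

```python
import re
from dataclasses import dataclass


def canonicalize(name: str) -> str:
    """Normalize a project name (lowercase, runs of -, _ and . become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


@dataclass(frozen=True)
class NoBinary:
    """Toy stand-in for the value passed to --no-binary (hypothetical)."""
    names: frozenset

    def allows_wheel(self, project: str) -> bool:
        # Current semantics: ':all:' forbids wheels for every project.
        if ":all:" in self.names:
            return False
        return canonicalize(project) not in self.names


def allows_wheel_proposed(no_binary: NoBinary, project: str, *, for_build_env: bool) -> bool:
    """Proposed semantics: ignore --no-binary inside the isolated build environment."""
    if for_build_env:
        # Build deps are only needed to produce metadata, so wheels are acceptable.
        return True
    return no_binary.allows_wheel(project)


everything_from_source = NoBinary(frozenset({":all:"}))
print(allows_wheel_proposed(everything_from_source, "numpy", for_build_env=False))
print(allows_wheel_proposed(everything_from_source, "Cython", for_build_env=True))
```

Under this sketch the downloaded artifacts stay source-only, while build dependencies used for metadata verification may come from wheels.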
Also, if users don’t specify `--no-deps` and do specify `--no-binary`, chances are they may still want to download PEP-517 build dependencies as well, not just runtime dependencies. If they specify `--no-binary :all:`, it would make sense to download and save sdists of build dependencies in the target download folder (this doesn’t happen right now, but may be worth a separate issue?). In any case, we don’t have to install build deps from sdists for metadata verification, and can use bdists.
Alternative Solutions
- Wait until https://github.com/pypa/pip/issues/1884 is addressed. However that issue looks like it may be a harder problem to solve, based on the history of the issue and reasons for why metadata checks are done currently.
- Add a flag making the optimized behavior optional, something like `--prefer-binary-in-pep-517-build-environment`
- There are some workarounds that suffice for downloading a single package without its dependencies, but they do not work for downloading a package and its runtime dependencies:
  - Use `pip download --no-deps --no-binary project-name`
  - Use the PyPI HTTP/JSON API to download a package, and completely avoid the metadata verification check: https://github.com/pypa/pip/issues/1884#issuecomment-800483766.
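The JSON API workaround could look roughly like the sketch below. It is not the code from the linked comment: the helper names `pick_sdist` and `download_sdist` are made up here, and the sketch assumes the release document at `https://pypi.org/pypi/<project>/<version>/json` lists files under `urls` with `packagetype`, `filename`, `url`, and `digests` fields. As noted above, it fetches a single sdist and does not resolve dependencies.

```python
import hashlib
import json
import urllib.request
from pathlib import Path


def pick_sdist(release: dict) -> dict:
    """Return the file entry for the sdist of a release, if one was published."""
    for entry in release.get("urls", []):
        if entry.get("packagetype") == "sdist":
            return entry
    raise LookupError("no sdist published for this release")


def download_sdist(project: str, version: str, dest: str) -> Path:
    """Download one sdist directly from PyPI, without building any metadata."""
    api_url = f"https://pypi.org/pypi/{project}/{version}/json"
    with urllib.request.urlopen(api_url) as response:
        release = json.load(response)
    entry = pick_sdist(release)
    with urllib.request.urlopen(entry["url"]) as response:
        payload = response.read()
    # Check the archive against the digest advertised by the index.
    if hashlib.sha256(payload).hexdigest() != entry["digests"]["sha256"]:
        raise ValueError("sha256 mismatch for " + entry["filename"])
    target = Path(dest) / entry["filename"]
    target.write_bytes(payload)
    return target
```

For the example used earlier in this issue, a call would be `download_sdist("numpy", "1.21.2", "/tmp")`.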
Additional context
Aspects of this issue have been discussed at various points in:
- https://github.com/numpy/numpy/pull/14053
- https://github.com/pypa/pip/issues/7995
- https://github.com/pypa/pip/issues/8387
- https://github.com/pypa/pip/issues/1884
- https://github.com/pypa/pip/issues/10195
- https://github.com/pypa/setuptools/discussions/2814
Code of Conduct
- I agree to follow the PSF Code of Conduct.
Issue Analytics
- State:
- Created 2 years ago
- Comments:14 (7 by maintainers)
Top GitHub Comments
In a sense, we are having this conversation exactly because we listened to use cases and actually implemented them. As Paul said, `--no-binary :all:` exists because there is a concrete use case where people want pip to install exactly all things from source due to organisation policies, and it is both cumbersome and error-prone to have to write down every package individually (for example, if you `pip install foo --no-binary=foo` and a new version of `foo` adds a new dependency `bar`, you’d be violating the source-only policy unknowingly).
Since the people requesting this new “packages that I consider a part of `:all:`” are the newcomers presenting a different use case, I don’t think it’s a good idea to hijack `:all:` to mean something else. You need a new feature, and we are willing to listen, but that doesn’t allow you to take away a feature not designed for you.
This semantic is shared between multiple pip commands (I think `download`, `install`, and `wheel`). The behaviour is more obvious for the latter two, because in those cases you’d expect pip to need the binaries (for installation), and `--no-binary` means “please don’t download those binaries, always build from scratch”. This semantic is less obvious for `download`, but it’s also not a good idea to make it mean anything else (that’d just confuse people in another way).
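The `foo`/`bar` scenario from the comment above can be illustrated with a toy model. This is not pip's resolver: `wheel_candidates` is a hypothetical helper that only mimics the per-package versus `:all:` distinction, and the `foo` release that gains a `bar` dependency is the comment's own hypothetical.

```python
def wheel_candidates(resolved, no_binary):
    """Return the resolved packages that could still be taken as wheels."""
    if ":all:" in no_binary:
        return []
    return [name for name in resolved if name not in no_binary]


# An older foo has no dependencies; a newer foo adds a dependency on bar.
print(wheel_candidates(["foo"], {"foo"}))            # []
print(wheel_candidates(["foo", "bar"], {"foo"}))     # ['bar']
print(wheel_candidates(["foo", "bar"], {":all:"}))   # []
```

With `--no-binary=foo`, the new dependency silently becomes eligible for a wheel, whereas `:all:` keeps the source-only policy intact.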