New Resolver chewing up bandwidth and time, for fully pinned requirements.
Description
I’ve just upgraded to the new resolver, and am using pip in a highly bandwidth-constrained environment. All reqs are pinned, and working. Notably, nearly all have been installed before and should be cached locally.
Using the new resolver took at least two hours (I cancelled the command after that amount of time) and burned through 2GB (?!) of data. Switching to --use-feature=fast-deps --use-deprecated=legacy-resolver installed everything in a matter of seconds.
This feels like a huge bug, both for the time, as reported elsewhere, but also for the bandwidth. 2GB is a huge amount of data in my context (about $10USD per GB).
Expected behavior
I’d expect that, for fully pinned requirements, pip just installs exactly what it was told to install.
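(As a possible workaround on my side, and assuming the freeze output below really captures every transitive dependency, I could presumably skip resolution entirely so pip installs only the listed pins:

```
pip install --no-deps -r requirements.txt
```

That only behaves correctly if requirements.txt is genuinely complete and mutually consistent, which is what I'm trying to guarantee here anyway.)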
pip version
21.0.1
Python version
3.8.9
OS
Linux (Ubuntu 18.04 via Docker)
How to Reproduce
- Start logging bandwidth (a rough sketch of how I did this is after the list) and start a stopwatch.
- Run pip install -r requirements.txt (using the file below).
- Watch how long that takes, and how much data is used.
- Restart logging.
- Run pip install -r requirements.txt --use-feature=fast-deps --use-deprecated=legacy-resolver
- Watch how much time and data that takes.
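A rough sketch of the bandwidth logging, assuming a Linux host where the traffic goes over eth0 (substitute the real interface name):

```
# Record received bytes before the install (Linux interface counters)
RX_BEFORE=$(cat /sys/class/net/eth0/statistics/rx_bytes)
time pip install -r requirements.txt
RX_AFTER=$(cat /sys/class/net/eth0/statistics/rx_bytes)
echo "Downloaded roughly $(( (RX_AFTER - RX_BEFORE) / 1024 / 1024 )) MB"
```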
Requirements.txt
asgiref==3.2.10
boto3==1.17.29
channels==2.4.0
channels-redis==3.0.1
cryptography>=2.3
csscompressor==0.9.5
django-annoying==0.10.4
django-anymail==6.0
django-compressor==2.2
django-celery-beat==1.6.0
django-celery-results==1.2.1
django-cors-headers==2.5.2
django-debug-toolbar==1.11.1
django-extensions==2.1.6
django-heroku-postgresify==0.4
django-hosts==3.0
django-htmlmin==0.11.0
django-ipware==2.1.0
django-localflavor==2.1
django-redis-cache==2.0.0
django-static-precompiler[watch]==1.8.2
django-storages==1.7.1
django==2.2.20
extraction==0.3
Faker==1.0.5
flake8==3.7.7
freezegun==1.1.0
gunicorn==20.0.4
hashids==1.2.0
honcho==1.0.1
html5lib==1.0.1
ipython==7.4.0
kombu==4.5.0
jwt==0.6.1
lxml==4.6.3
mistune==2.0.0a4
mock==2.0.0
pillow==8.1.1
piprot==0.9.10
psycopg2==2.8.2
pytz==2019.1
PyYAML==5.4
python-magic==0.4.22
stripe==2.54.0
sorl-thumbnail==12.5.0
tblib==1.3.2
user-agents==2.0
twisted==20.3.0
## The following requirements were added by pip freeze:
amqp==2.4.2
argh==0.26.2
args==0.1.0
asn1crypto==0.24.0
backcall==0.1.0
beautifulsoup4==4.7.1
billiard==3.6.0.0
bleach==3.1.4
certifi==2019.3.9
chardet==3.0.4
clint==0.4.1
decorator==4.4.0
dj-database-url==0.5.0
django-appconf==1.0.3
django-timezone-field==3.0
docutils==0.14
entrypoints==0.3
idna==2.6
ipython-genutils==0.2.0
jedi==0.13.3
jmespath==0.10.0
keyring==10.6.0
keyrings.alt==3.0
mccabe==0.6.1
parso==0.4.0
pathtools==0.1.2
pbr==5.1.3
pexpect==4.7.0
pickleshare==0.7.5
ply==3.11
prompt-toolkit==2.0.9
ptyprocess==0.6.0
pycodestyle==2.5.0
pyflakes==2.1.1
Pygments==2.7.4
python-crontab==2.3.6
python-dateutil==2.8.0
pyxdg==0.25
rcssmin==1.0.6
redis==3.5.3
requests==2.21.0
requests-futures==0.9.9
rjsmin==1.0.12
SecretStorage==2.3.1
six==1.11.0
soupsieve==1.9.1
sqlparse==0.3.0
text-unidecode==1.2
traitlets==5.0.5
ua-parser==0.8.0
urllib3==1.24.2
vine==1.3.0
watchdog==0.8.2
wcwidth==0.1.7
webencodings==0.5.1
uvicorn==0.11.8
Output
Scenario 1: I gave up after two hours; >2GB of data used.
Scenario 2: ran in less than a minute; <~100MB of data used.
Code of Conduct
- I agree to follow the PSF Code of Conduct.
If the resolver downloads more than one version of any package, your requirements are not “fully pinned” by definition 🙂 This is essentially by design, and you can see quite easily what is causing the extra downloads in the logs and fix the pin.
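A sketch of one way to do that (the grep pattern below is an assumption about what pip's verbose output looks like, not an exact format guarantee): run the install verbosely, then look for packages that are downloaded at several different versions and add or tighten the pin for whichever one shows up repeatedly.

```
pip install -r requirements.txt -v 2>&1 | tee pip-resolve.log
# Repeated "Downloading <package>-<version>..." lines for the same package
# at different versions point at the pin that needs to be added or tightened.
grep -i "downloading" pip-resolve.log | sort | uniq -c | sort -rn | head
```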
Hey @pradyunsg and @pfmoore, thanks for digging in.
A quick main reply on this: I really appreciate both of you investigating what you're hearing as the bug - the huge, long install times with this particular set of reqs. But what I'm hoping for a reply on is how folks on metered-bandwidth setups (like most of the world) are supposed to use the new pip resolver.
Paul, I appreciate that your install in a fresh virtualenv only took 30 seconds. I can guarantee you’re on much, much faster internet than I currently have access to, and I’d guess you don’t pay per byte. Just downloading a small library (a few hundred KB) takes 30 seconds here. Those botocore resolves you hit are part of what burned through much of my time and bandwidth.
Pip obviously isn’t responsible for internet speeds or costs, but when there are some really large Python packages and the resolver downloads the full package many times just to check compatibility, it’s unworkable. An ideal solution would be to keep the resolution smarts but only download setup.py or a similar subset to establish dependencies - though I assume this was considered and deemed unworkable. (If not, can this happen?) Given that the new resolver is here to stay, I’d love to know how to use pip in low-bandwidth and metered-bandwidth environments. Thanks to you both.
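For instance, is keeping the new resolver but enabling fast-deps on its own (without falling back to the legacy resolver) expected to help here? A sketch of what I mean, assuming the index supports the range requests fast-deps relies on:

```
pip install --use-feature=fast-deps -r requirements.txt
```

My understanding is that this tries to fetch only the metadata portion of wheels during resolution and falls back to full downloads for sdists or indexes without range support, but I may be wrong about the details.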
On my specific reqs install:
Thanks for taking the time to dig in and test this locally. My installation is in a Docker container, built fresh whenever reqs change. The trigger for this was an upgrade to Python 3.8.8 and pip, and docker-compose build is what triggered all the trouble. I’ll dig through the git history to see when the file was last frozen and what changed since. This repo is about 2 years old, though the bones of the dependencies are from projects probably 6 or 8 years before that. But nothing should have been easy_installed. I’ll see if I can dig up when and what the versions from the last freeze were, and try to debug what happened there.
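One approach I'm considering for the Docker rebuilds, assuming I can pay for the downloads once on a less-metered connection: fetch everything up front with pip download, then make the in-container install purely offline so rebuilds never touch the network.

```
# On the host (or a machine with cheaper bandwidth): fetch everything once
pip download -d ./wheelhouse -r requirements.txt

# Inside the image build: install only from the local directory
pip install --no-index --find-links ./wheelhouse -r requirements.txt
```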
I definitely hear you that botocore (and probably more) should be in the generated freeze list below, and isn’t.