pip uses backtracking when dependency installation fails


Description

When installation of a dependency fails, pip uses its backtracking feature to try other versions of the package, even if the failure is not due to a version conflict.

Expected behavior

I understand that backtracking is useful for resolving version conflicts. Trying different versions when the installation fails for a reason other than a version conflict is IMO not useful most of the time, as such a failure often indicates a missing system package.

I find this particularly annoying during CI tests, as it takes forever before the test actually fails. If this is intended behavior, it would be great to have a flag to disable it.

pip version

21.0.1

Python version

3.7.10

OS

Arch Linux

How to Reproduce

As an example, I install scikit-bio into a clean environment. This fails because the package doesn’t properly declare its numpy dependency:

conda create -n test_skbio python=3.7 pip
conda activate test_skbio
pip install scikit-bio

Output

Collecting scikit-bio
  Using cached scikit-bio-0.5.6.tar.gz (8.4 MB)
    ERROR: Command errored out with exit status 1:
     command: /home/sturm/anaconda3/envs/test_skbio/bin/python3.7 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_e1c29669eff64467acdb675f656b2ef2/setup.py'"'"'; __file__='"'"'/home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_e1c29669eff64467acdb675f656b2ef2/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /home/sturm/tmp/pip-pip-egg-info-95p85io3
         cwd: /home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_e1c29669eff64467acdb675f656b2ef2/
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_e1c29669eff64467acdb675f656b2ef2/setup.py", line 20, in <module>
        import numpy as np
    ModuleNotFoundError: No module named 'numpy'
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/66/b0/054ef21e024d24422882958072973cd192b492e004a3ce4e9614ef173d9b/scikit-bio-0.5.6.tar.gz#sha256=48b73ec53ce0ff2c2e3e05f3cfcf93527c1525a8d3e9dd4ae317b4219c37f0ea (from https://pypi.org/simple/scikit-bio/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
  Using cached scikit-bio-0.5.5.tar.gz (8.3 MB)
    ERROR: Command errored out with exit status 1:
     command: /home/sturm/anaconda3/envs/test_skbio/bin/python3.7 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_ef40087a894243eea6e9ba7506c90c26/setup.py'"'"'; __file__='"'"'/home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_ef40087a894243eea6e9ba7506c90c26/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /home/sturm/tmp/pip-pip-egg-info-p8h2qvwu
         cwd: /home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_ef40087a894243eea6e9ba7506c90c26/
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_ef40087a894243eea6e9ba7506c90c26/setup.py", line 20, in <module>
        import numpy as np
    ModuleNotFoundError: No module named 'numpy'
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/2d/ff/3a909ae8c212305846f7e87f86f3902408b55b958eccedf5d4349e76c671/scikit-bio-0.5.5.tar.gz#sha256=9fa813be66e88a994f7b7a68b8ba2216e205c525caa8585386ebdeebed6428df (from https://pypi.org/simple/scikit-bio/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
  Using cached scikit-bio-0.5.4.tar.gz (8.3 MB)
    ERROR: Command errored out with exit status 1:
     command: /home/sturm/anaconda3/envs/test_skbio/bin/python3.7 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_aa90daa04e0549fbbd36b29262ef299e/setup.py'"'"'; __file__='"'"'/home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_aa90daa04e0549fbbd36b29262ef299e/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /home/sturm/tmp/pip-pip-egg-info-d6wu69n6
         cwd: /home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_aa90daa04e0549fbbd36b29262ef299e/
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/sturm/tmp/pip-install-0wj1ulih/scikit-bio_aa90daa04e0549fbbd36b29262ef299e/setup.py", line 20, in <module>
        import numpy as np
    ModuleNotFoundError: No module named 'numpy'
    ----------------------------------------
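
To see at a glance how many candidates pip tried and threw away in a log like the one above, the "WARNING: Discarding" lines can be collected with a few lines of Python. This is only a sketch: the regular expression is an assumption based on the pip 21.0.1 output shown here, not a stable pip interface, and the URLs in the sample are made up for illustration.

```python
import re

# Matches "WARNING: Discarding <url>#..." lines as they appear in the log above.
# Assumption: this wording is specific to the pip version in this issue.
DISCARD_RE = re.compile(r"^WARNING: Discarding \S*/(?P<file>[^/#\s]+)#")


def discarded_sdists(log_text):
    """Return the file names of the candidates pip discarded, in log order."""
    found = []
    for line in log_text.splitlines():
        match = DISCARD_RE.match(line.strip())
        if match:
            found.append(match.group("file"))
    return found


# Hypothetical, shortened sample lines (the hosts and hashes are not real).
sample = """\
Collecting scikit-bio
WARNING: Discarding https://example.invalid/packages/scikit-bio-0.5.6.tar.gz#sha256=abc (from https://example.invalid/simple/scikit-bio/). Command errored out with exit status 1
  Using cached scikit-bio-0.5.5.tar.gz (8.3 MB)
WARNING: Discarding https://example.invalid/packages/scikit-bio-0.5.5.tar.gz#sha256=def (from https://example.invalid/simple/scikit-bio/). Command errored out with exit status 1
"""

print(discarded_sdists(sample))
```

Each entry corresponds to one source distribution whose metadata build failed before pip moved on to the next older version.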

Code of Conduct

I agree to follow the PSF Code of Conduct.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 4
  • Comments: 20 (13 by maintainers)

Top GitHub Comments

2 reactions
pfmoore commented, Apr 7, 2021

“I’ll keep working with the same isolated example: One package is pinned to an old version and another more recent package dependency has a reverse dependency to the same package, but with a new version.”

I agree that the behaviour here is bad, but I don’t have enough information to understand why it’s happening yet. Let’s come back to that, though.

Regarding your proposed solution:

“Have a global counter of the number of steps performed in backtracking”

We have that. It’s called max_rounds and is set here, to 2000000. That may seem a lot, but it was originally set much lower, and we got users complaining that pip gave up too soon. We found that in terms of time spent, the number of rounds could be increased a lot without the time being affected too badly, so we increased to the current value.
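
To make the idea of a round limit concrete, here is a toy backtracking resolver with a round counter. It is a minimal sketch and not pip's or resolvelib's implementation: the `resolve` function, the `TooManyRounds` exception, the shape of `index`, and the integer versions are all hypothetical and exist only for illustration.

```python
class TooManyRounds(Exception):
    """Raised when the toy resolver exceeds its round budget."""


def resolve(index, requirements, max_rounds):
    """Toy backtracking resolver (hypothetical, not pip's implementation).

    index: {name: {version: {dep_name: set_of_allowed_versions}}}
    requirements: {name: set_of_allowed_versions}
    Returns (pins_or_None, rounds_used) or raises TooManyRounds.
    """
    rounds = 0

    def solve(pins, pending):
        nonlocal rounds
        if not pending:
            return pins
        (name, allowed), rest = pending[0], pending[1:]
        if name in pins:
            # Already pinned: the existing pin must satisfy this requirement too.
            return solve(pins, rest) if pins[name] in allowed else None
        # Try the newest allowed version first.
        for version in sorted(index[name], reverse=True):
            if version not in allowed:
                continue
            rounds += 1
            if rounds > max_rounds:
                raise TooManyRounds(f"gave up after {max_rounds} rounds")
            deps = list(index[name][version].items())
            result = solve({**pins, name: version}, rest + deps)
            if result is not None:
                return result
            # Reaching this point means the choice was thrown away: one backtrack.
        return None

    return solve({}, list(requirements.items())), rounds


# "lib" 2 needs "base" 2, but "old" 1 needs "base" 1, so "lib" must fall back to 1.
index = {
    "lib": {2: {"base": {2}}, 1: {"base": {1}}},
    "old": {1: {"base": {1}}},
    "base": {2: {}, 1: {}},
}
pins, rounds = resolve(index, {"lib": {1, 2}, "old": {1}}, max_rounds=100)
print(pins, rounds)
```

Note that the counter only bounds the number of candidates tried. It says nothing about how long each candidate takes to evaluate, which is the distinction drawn in the next paragraph.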

The problem you have appears to be that in your case, the time spent is not because of too many rounds. But we don’t know what it is.

So we need more information. If you were to profile your case, and identify:

  1. Where the time is actually being spent.
  2. How many times pip does a step that gets thrown away by backtracking (please be careful here, we need details - trying to build 100 dependencies, finding a conflict and throwing them away is one backtrack, even if it takes many hours and 100 package builds were thrown away).
  3. In particular, what proportion of time did pip spend building stuff just to extract metadata (dependency information). Our best theory at the moment for all of these “pip takes ages” cases is that pip is building heaps of stuff because the only way to get dependency information for a sdist is to build it.
  4. What information pip has available when a backtrack occurs, and how much help that is in “pruning” the list of options remaining (hint: we’ve done this, and it’s really hard - see previous comment about “pip doesn’t know that a build failed because of missing system headers”)
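
As a rough first pass at point 1, one could timestamp every line a command prints and look for the longest silent gaps. The sketch below assumes nothing about pip internals; the helper names `timed_output` and `slowest_steps` are made up for this example, and the demo runs a harmless Python one-liner rather than pip. Output that is buffered by the child process will blur the timings, so treat the result as a hint, not a profile.

```python
import subprocess
import sys
import time


def timed_output(cmd):
    """Run cmd and return [(seconds_since_start, line), ...] for its output."""
    start = time.monotonic()
    stamped = []
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        for line in proc.stdout:
            stamped.append((time.monotonic() - start, line.rstrip("\n")))
    return stamped


def slowest_steps(stamped, top=5):
    """Return (gap_seconds, line) pairs for the lines followed by the longest gaps."""
    gaps = [
        (later[0] - earlier[0], earlier[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, reverse=True)[:top]


# Harmless demo command; for a real run one might pass something like
# [sys.executable, "-m", "pip", "install", "-v", "scikit-bio"] instead.
demo = timed_output([sys.executable, "-c", "print('one'); print('two')"])
for gap, line in slowest_steps(demo):
    print(f"{gap:8.2f}s after: {line}")
```

The gap after a line is attributed to whatever step that line announced, for example a "Using cached ..." line followed by a long metadata build.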

Then, we might be able to determine where the problem lies in your case. Without trying to pre-judge, I’m fairly certain that the answer won’t be something pip can address easily (typically, it’s builds that take a long time to complete).

Some workarounds which I’m sure aren’t acceptable, but may give you some food for thought:

  1. Hit CTRL-C after the install has been going for 30 minutes. At that point, as a first step, you can assume that pip has gone into some sort of backtracking spiral, so add constraints to fix that. If you can’t work out how to do that, even with pip’s verbose log information, consider why you believe pip can. Equally, if you don’t know whether 30 minutes is the right length of time to wait, consider how pip could know any better than you.
  2. Pre-build any packages you might need for the install. Then pip can just install wheels, which is extremely unlikely to be slow. That might be a pain, because you have to track dependencies to work out what’s needed - but that’s what pip has to do, so maybe that’s where the cost lies?
  3. A combination - kill the process, look at what pip needed to do, prebuild stuff, repeat.
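
The first two workarounds could be scripted roughly as follows. This is a sketch under assumptions: the helper names and the `wheelhouse` directory are hypothetical, while `pip wheel --wheel-dir` and `pip install --no-index --find-links` are existing pip options. The demo only builds the command lists and exercises the deadline helper with a stand-in command, so it does not touch the network.

```python
import subprocess
import sys


def pip_cmd(*args):
    """Build a command that runs pip with the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


def prebuild_commands(requirement, wheel_dir="wheelhouse"):
    """Workaround 2: build wheels once, then install only from that directory."""
    return [
        pip_cmd("wheel", "--wheel-dir", wheel_dir, requirement),
        pip_cmd("install", "--no-index", "--find-links", wheel_dir, requirement),
    ]


def run_with_deadline(cmd, seconds):
    """Workaround 1, automated: True if cmd finished in time, False if killed."""
    try:
        subprocess.run(cmd, timeout=seconds, check=False)
    except subprocess.TimeoutExpired:
        return False
    return True


for cmd in prebuild_commands("scikit-bio"):
    print(" ".join(cmd))

# Stand-in for a long install: sleeps for 5 seconds but is cut off after 0.5.
finished = run_with_deadline(
    [sys.executable, "-c", "import time; time.sleep(5)"], seconds=0.5
)
print("finished in time:", finished)
```

In CI, the deadline would be whatever the team considers a reasonable upper bound for an install, for example the 30 minutes mentioned above.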

None of these will fix the issue, but they may give you insights, and possibly even suggest a way forward. If you produce a proof of concept fix from that which helps your issue, we’d love to know.

“Maybe I haven’t noticed cases where backtracking was active and helpful because it Just Worked®?”

Quite probably. We have many millions of people using pip daily. And we’ve had people comment that the new resolver was a significant benefit for them. Honestly, do you really think we would have released the new resolver if we’d had feedback that it was a net loss? This is probably the most extensively publicised feature pip has ever released, and we did more user research on it than we ever had before (thanks to the funding we received). So yes, I’m afraid you are in a small minority here. I know that’s no help to you personally, but as pip maintainers we have to look at the wider picture.

“When you find that people have benefited from backtracking, how many steps have typically run?”

We have no idea. Nobody tells us anything when things work well. Maybe you can imagine how demoralising that can be? Particularly when people who raise issues assume we have all that information to hand 🙁

I think we’re just going round in circles now (ironic, really 😉). I suggest that if you want to make progress with this, you profile where pip is spending its time, as I suggested above, and give us some feedback on precisely what pip (or the build tool) is doing in all that time.

2 reactions
grst commented, Apr 2, 2021

I see! I still believe something like --fail-fast would be useful for CI builds.
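
pip has no such flag in the version discussed here, but a CI job could approximate the idea from the outside by watching pip's output and stopping at the first build failure. The sketch below is an assumption-laden illustration: `run_fail_fast` is a hypothetical helper, the marker string is copied from the pip 21.0.1 log in this issue and may differ in other versions, and the demo uses a fake command instead of pip.

```python
import subprocess
import sys

# Marker taken from the pip 21.0.1 log in this issue (an assumption; other
# pip versions word build failures differently).
FAILURE_MARKER = "ERROR: Command errored out"


def run_fail_fast(cmd, marker=FAILURE_MARKER):
    """Run cmd, echo its output, and stop at the first line containing marker.

    Returns (exit_code, lines_seen); exit_code is None when the run was aborted.
    """
    seen = []
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    try:
        for line in proc.stdout:
            line = line.rstrip("\n")
            seen.append(line)
            print(line)
            if marker in line:
                proc.kill()  # do not wait for further candidates to be tried
                return None, seen
        return proc.wait(), seen
    finally:
        proc.stdout.close()
        proc.wait()


# Fake command standing in for pip: prints a failure, then would keep running.
fake = (
    "import time;"
    "print('Collecting x', flush=True);"
    "print('ERROR: Command errored out with exit status 1', flush=True);"
    "time.sleep(30);"
    "print('never reached')"
)
code, lines = run_fail_fast([sys.executable, "-c", fake])
print("aborted" if code is None else f"exit code {code}")
```

With a real install the command would be something like `[sys.executable, "-m", "pip", "install", "scikit-bio"]`. Matching on log text is brittle, so this is a stopgap rather than a replacement for a proper option in pip.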
