Resolution of dependencies substantially slower when using private repositories
- I am on the latest Poetry version.
- I have searched the issues of this repo and believe that this is not a duplicate.
- If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
- OS version and name: Fedora 34
- Poetry version: 1.1.6
- Link of a Gist with the contents of your pyproject.toml file: https://gist.github.com/MasterNayru/602f564ca19c68906fdb3c5a1eba2818
Issue
I am getting a huge difference in speed when resolving dependencies on the same project when only using PyPI vs using a private repository. The private repository (AWS CodeArtifact, in this case) is configured to use PyPI as an upstream source for content that isn’t found locally, which makes this difference in speed even more difficult for me to understand, as the package contents should be identical for all packages in my project (I am not actually using any private packages in my project yet).
When I have only PyPI enabled (by commenting out the tool.poetry.source block in my linked gist), after having already run a poetry update so that the cache is fully populated, I get this time:
$ poetry update -vvv
<snip>
1: selecting importlib-metadata (4.3.0)
1: derived: zipp (>=0.5)
PyPI: 21 packages found for zipp >=0.5
1: selecting zipp (3.4.1)
PyPI: 32 packages found for importlib-resources >=1.0
1: fact: importlib-resources (5.1.4) depends on zipp (>=3.1.0)
1: selecting importlib-resources (5.1.4)
PyPI: 42 packages found for colorama *
1: selecting colorama (0.4.4)
PyPI: 7 packages found for atomicwrites >=1.0
1: selecting atomicwrites (1.4.0)
1: Version solving took 0.471 seconds.
1: Tried 1 solutions.
and when I enable that block, I get this time:
$ poetry update -vvv
<snip>
1: selecting importlib-metadata (4.3.0)
1: derived: zipp (>=0.5)
codeartifact: 21 packages found for zipp >=0.5
1: selecting zipp (3.4.1)
codeartifact: 32 packages found for importlib-resources >=1.0
1: fact: importlib-resources (5.1.4) depends on zipp (>=3.1.0)
1: selecting importlib-resources (5.1.4)
codeartifact: 42 packages found for colorama *
1: selecting colorama (0.4.4)
codeartifact: 7 packages found for atomicwrites >=1.0
1: selecting atomicwrites (1.4.0)
1: Version solving took 48.303 seconds.
1: Tried 1 solutions.
Is there any reason why enabling this repo would cause a 100x increase in resolution time when it has all of the wheels downloaded locally already? Is there something in the responses Poetry gets from PyPI when it looks up packages that could explain the huge increase in time? If Poetry actually looks in the wheels for information about the packages, then the wheels literally come from the same place, so I am finding it really hard to understand what is going on here.
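For anyone who wants to compare their own runs, here is a minimal Python sketch that pulls the "Version solving took ... seconds" line out of two -vvv logs and computes the ratio. The helper name solve_seconds is made up for illustration, and the two log snippets are just the final lines from the output above.

```python
import re

# Matches the timing line that `poetry update -vvv` prints at the end of solving.
SOLVE_TIME = re.compile(r"Version solving took ([0-9.]+) seconds")


def solve_seconds(log_text: str) -> float:
    """Return the solve time reported in a Poetry debug log (hypothetical helper)."""
    match = SOLVE_TIME.search(log_text)
    if match is None:
        raise ValueError("no 'Version solving took' line found")
    return float(match.group(1))


# Final lines of the two runs quoted in this issue.
pypi_log = "1: Version solving took 0.471 seconds.\n1: Tried 1 solutions."
codeartifact_log = "1: Version solving took 48.303 seconds.\n1: Tried 1 solutions."

slowdown = solve_seconds(codeartifact_log) / solve_seconds(pypi_log)
print(f"slowdown: {slowdown:.1f}x")  # prints: slowdown: 102.6x
```

So the "100x" figure is not an exaggeration: 48.303 / 0.471 is a little over 102.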
+1 here… We use Gemfury to host ~5 private packages, out of some ~200 total in use in our project. poetry update takes forever.
I just did an experiment, timing two commands under several Poetry versions:
- poetry install
- poetry update
For poetry install: (timing table not preserved)
For poetry update: (timing table not preserved)
(It was probably unnecessary to run versions 1.1.5, 1.1.6 and 1.1.7, but at least they show consistency with version 1.1.4.)
I ran some tests twice, and got fairly consistent numbers (less than 10 seconds apart, roughly).
So, at least in my case, something happened in version 1.1.4 to cause major slowdown of dependency resolution for my system 🤷.
I might revert back down to 1.1.3… or just take on the slowness. It’s not like we run fresh install or update often… And once the lock file is in place, installs are still fast.
Also - not sure really if the slowness is due to 3rd party dependencies - but figured to chime in on this thread at any rate.
Experiment run pretty much like this:
rm -rf poetry.lock .venv; make install_and_setup_poetry; time poetry install ; time poetry update
make install_and_setup_poetry is a Makefile target that installs a particular version of Poetry, which I changed in between tests.
There are two parts to this issue: problems that can be solved in Poetry, and problems that are part of the ecosystem and must be solved there first.
For the first part, Poetry does unnecessary work with package lookups much of the time, because our resolution was originally modeled on pip’s “look for every match from every source equally”. We later added priorities and source =, but our source lookup model was not designed with this in mind. See #6713 for a proposal to implement a less surprising model of sources.
For the second part, Poetry must download (and often even build) packages to gather metadata and recursively resolve dependencies. In order to reduce this work (and achieve the same performance we have with PyPI, which has a non-standard API that we can gather some of this information from), the accepted PEP 691 and PEP 658 standards need to be implemented by each of your custom repositories. The good news is that they are very deliberately fully backwards compatible with the existing API, and will not require any action from end users.
The bad news is this will take some time: PyPI has yet to implement 691, though a PR has been long in the works. Once PyPI gains support, we will likely implement this in Poetry and unify all of our source handling code to no longer special-case PyPI.
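To make the two PEPs a bit more concrete, here is a rough Python sketch of what a client could do against a repository that supports them. It is not Poetry's code. The function names, the example.invalid host and the sample listing are all made up for illustration. What comes from the PEPs themselves is the Accept media type (PEP 691), the "dist-info-metadata" key on each file entry, and the convention that the metadata file lives at the file URL with ".metadata" appended (PEP 658).

```python
from urllib.request import Request

# Media type defined by PEP 691 for the JSON form of the Simple API.
PEP691_JSON = "application/vnd.pypi.simple.v1+json"


def simple_index_request(index_url: str, project: str) -> Request:
    """Build (but do not send) a request for a project's PEP 691 JSON listing."""
    url = f"{index_url.rstrip('/')}/{project}/"
    return Request(url, headers={"Accept": PEP691_JSON})


def metadata_urls(listing: dict) -> dict:
    """Map filename -> URL of the separately served metadata file (PEP 658).

    Only files that advertise metadata are included; for the rest a resolver
    still has to download (and possibly build) the distribution itself.
    """
    urls = {}
    for entry in listing.get("files", []):
        # PEP 691 uses "dist-info-metadata"; PEP 714 later renamed it "core-metadata".
        if entry.get("core-metadata") or entry.get("dist-info-metadata"):
            # Assumes an absolute URL; relative URLs would need resolving first.
            urls[entry["filename"]] = entry["url"] + ".metadata"
    return urls


# Hypothetical repository and response, for illustration only.
request = simple_index_request("https://repo.example.invalid/simple/", "zipp")
sample_listing = {
    "name": "zipp",
    "files": [
        {
            "filename": "zipp-3.4.1-py3-none-any.whl",
            "url": "https://repo.example.invalid/files/zipp-3.4.1-py3-none-any.whl",
            "hashes": {},
            "dist-info-metadata": True,
        },
        {
            "filename": "zipp-3.4.1.tar.gz",
            "url": "https://repo.example.invalid/files/zipp-3.4.1.tar.gz",
            "hashes": {},
        },
    ],
}
print(request.full_url)
print(metadata_urls(sample_listing))
```

With a repository that answers this way, the dependency information for the wheel is one small request away. For the sdist entry, which advertises no metadata, the resolver is back to downloading the archive, which is the slow path described above.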