Dependency resolution differences (wrong) when using custom (i.e. not pypi) repository

  • I am on the latest Poetry version.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
  • OS version and name: Windows 10 and Ubuntu 20.04
  • Poetry version: 1.1.8
  • Link of a Gist with the contents of your pyproject.toml file: not needed to explain

Issue

Any source defined in pyproject.toml that is not PyPI is always handled internally as a LegacyRepository.

That means metadata is not collected from API calls, but always by downloading and parsing packages, usually sdists.

I probably don’t have to explain why this is bad for speed, and the slowdown is significant enough that people notice it. See for example #4113.

Sometimes, however, the package metadata would have been available on an API endpoint but Poetry can’t figure it out by parsing the sdist, and this leads to problems in dependency resolution.

For example, scikit-image 0.17.2 sdist imports numpy in its setup.py, but it doesn’t specify any build requirements in pyproject.toml, so running setup.py fails. Poetry then just silently concludes scikit-image doesn’t have any dependencies, which is clearly wrong.

This is exactly what happens in #3464 and is also how I first encountered this bug.

If you install this package from PyPI, however, everything goes smoothly because the metadata is collected from the API endpoint instead.
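For illustration, here is a minimal sketch of what reading metadata from an API looks like, assuming the JSON API that pypi.org serves at /pypi/<name>/<version>/json. The helper names and the sample payload are made up for this example; they are not Poetry's code and not a real response for any release. The point is that a missing requires_dist can be reported as None (unknown) instead of being treated as an empty list (no dependencies).

```python
import json
from typing import List, Optional


def json_api_url(name: str, version: str, base: str = "https://pypi.org/pypi") -> str:
    """Build the PyPI-style JSON API URL for one release."""
    return f"{base}/{name}/{version}/json"


def requires_dist_from_payload(payload: str) -> Optional[List[str]]:
    """Return the declared requirements, or None when the index does not report them."""
    info = json.loads(payload).get("info", {})
    requires = info.get("requires_dist")
    # None means "unknown", which is different from [] ("no dependencies").
    return None if requires is None else list(requires)


# Illustrative payload only (hypothetical package and requirements).
SAMPLE = json.dumps(
    {
        "info": {
            "name": "example-pkg",
            "version": "1.0",
            "requires_dist": ["numpy>=1.15", "scipy>=1.0"],
        }
    }
)

print(json_api_url("scikit-image", "0.17.2"))
print(requires_dist_from_payload(SAMPLE))
```

A resolver built this way would only fall back to downloading and parsing the sdist when the function returns None.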

So in short, for the exact same dependencies, you may not get the same dependency resolution depending on which source repository you use (PyPI or something else), even if the alternative source is a direct reverse proxy to PyPI.

Suggested fix

Option 1 - Fully automated

This is the ideal option. Poetry becomes clever enough to figure out for any source if it can provide metadata via an API just like pypi can. A mechanism needs to be built that tests this per configured source.

You could look at hostnames to try and optimize this guessing game a little bit.
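A rough sketch of what such a per-source test could look like, under a few assumptions: the source URL ends in /simple/, a PyPI-style JSON endpoint would live next to it under /pypi/, and a 200 response means the API is usable. The names probe_source, candidate_json_url and known_hosts are hypothetical and do not exist in Poetry. The fetcher is passed in so the logic can be tried without any network access.

```python
from typing import Callable, Dict, Optional
from urllib.parse import urlsplit

# Hypothetical fetcher type: takes a URL and returns an HTTP status code.
Fetch = Callable[[str], int]


def candidate_json_url(source_url: str, name: str, version: str) -> str:
    """Guess a PyPI-style JSON endpoint from a simple-index URL (an assumption)."""
    parts = urlsplit(source_url)
    path = parts.path.rstrip("/")
    if path.endswith("/simple"):
        path = path[: -len("/simple")]
    return f"{parts.scheme}://{parts.netloc}{path}/pypi/{name}/{version}/json"


def probe_source(
    source_url: str,
    fetch: Fetch,
    name: str,
    version: str,
    known_hosts: Optional[Dict[str, bool]] = None,
) -> Dict[str, bool]:
    """Report whether a source appears to offer a JSON metadata API."""
    host = urlsplit(source_url).hostname or ""
    # Hostname shortcut: skip the request when the answer is already known.
    if known_hosts is not None and host in known_hosts:
        return {"json_api": known_hosts[host]}
    try:
        ok = fetch(candidate_json_url(source_url, name, version)) == 200
    except OSError:
        # Network errors are treated as "capability not available".
        ok = False
    return {"json_api": ok}


def fake_fetch(url: str) -> int:
    """Stand-in fetcher: only foo.bar pretends to serve the JSON API."""
    return 200 if url.startswith("https://foo.bar/pypi/") else 404


print(probe_source("https://foo.bar/simple/", fake_fetch, "requests", "2.26.0"))
```

The result would be cached per source, so the probe runs once rather than on every resolution.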

Option 2 - User configurable

Allow users to configure the capabilities a source has available in pyproject.toml. This would basically put the responsibility with the user to tell poetry what APIs can be consumed.

[[tool.poetry.source]]
name = "foo"
url = "https://foo.bar/simple/"
capabilities = { foo = true, bar = true }

If you agree with one of the suggested improvements, I can do the work and open a PR. I’m pretty sure many users will reap the benefits in performance and correctness!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 5
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

2 reactions
samedii commented, Jan 24, 2022

Tip for others who are also affected by this: until this is resolved, we are moving to installing via git instead.

➜ poetry add git+ssh://git@github.com:my-org/my-private-package.git#v0.4.8         

Updating dependencies
Resolving dependencies... (96.1s)

Writing lock file

Using private pypi:

➜ poetry add my-private-package 
Using version ^0.4.8 for my-private-package

Updating dependencies
Resolving dependencies... (217.9s)^C (Keyboard interrupt)

Installing with private pypi can take hours for us.

1 reaction
MasterNayru commented, Aug 26, 2021

I had read through the code when I had created my issue and my understanding of the cause matches yours. I think, though, that assuming that all PyPI-like backends support the simple API would be brave. For example, I was using AWS CodeArtifact as one of my PyPI backends, and that supports the legacy API and not the simple one.

I know that it is far from ideal, but it’s probably best to just try every front door for custom repositories and see which APIs are available to use, rather than just resorting to sdist downloads for all custom repositories. It’s a sad state of affairs when the thing storing packages can’t be trusted to answer really basic questions about what packages need to be installed correctly, but these performance and correctness issues really undercut a huge amount of the value add that users get from using Poetry. I appreciate that Poetry is trying to do the “right thing”, but it’s tiring to advocate for using tools like this and have it take ages to do its calculations, especially when it’s not making use of available API endpoints to do so.
