Only pull in files from --extra-index-url if the package exists on --index-url

What’s the problem this feature will solve?

I was wondering whether there is a middle ground for #8606 between changing nothing (behaviourally, setting aside user education issues) and removing --extra-index-url outright (which most seem to disagree with).

From what I can tell, there are two main usages of --extra-index-url:

  1. To serve files supplementing packages published on the main index. This is the intended usage (and what piwheels is doing).
  2. To serve packages not available on the main index. This is what we don’t want people to do, since it is susceptible to supply chain attacks (where the “attacker” may be the users themselves).

So I considered the difference between the two usages and tried to come up with a strategy that keeps the first use case working unchanged, but makes the second fail early so people are directed to other solutions before things become dangerous.

Describe the solution you’d like

Consider the most common scenario, where --index-url is set to pypi.org. For pip install mypackage --extra-index-url=https://myindex, files listed on https://myindex/mypackage/ are collected by pip only if https://pypi.org/mypackage/ returns a non-error response with at least one file on the page. In other words, the main index also acts as pip’s “canonical package name registry”: extra indexes may only add files under names provided by the registry, and cannot register new project names. I think this should cover most legitimate --extra-index-url usages.
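A minimal sketch of the rule described above, assuming each index is modelled as a plain dictionary mapping project names to file names. The `Index` alias and `collect_files` function are hypothetical names used for illustration only; they are not part of pip.

```python
from typing import Dict, List

# Hypothetical stand-in for an index: project name -> files listed on its page.
# A missing key models an error response; an empty list models an empty page.
Index = Dict[str, List[str]]


def collect_files(project: str, main_index: Index, extra_indexes: List[Index]) -> List[str]:
    """Return candidate files for `project` under the proposed rule."""
    main_files = main_index.get(project, [])
    if not main_files:
        # The main index does not register this name (or lists no files),
        # so anything the extra indexes serve under it is ignored.
        return []
    files = list(main_files)
    for extra in extra_indexes:
        # Extra indexes may only add files under names the main index provides.
        files.extend(extra.get(project, []))
    return files
```

With a main index that lists only `numpy` and an extra index that lists both `numpy` wheels and a private project, the `numpy` files from both indexes are collected, while the private project yields nothing.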

This will also break most of the inappropriate usages, since those private packages are generally not available on PyPI and won’t be picked up by pip (pip will need to emit a warning saying the project is found but ignored, so users don’t think this is a bug). The “natural” response from these people would be to swap the indexes (erroneously thinking the index order is significant), and that would still break because their private index does not serve PyPI projects. The only way to correctly make their private project installable is to point --index-url to an index that contains both PyPI and their private projects, which is our recommended best practice.

The only variant not solved by this design would be if mypackage does exist on PyPI, and someone relies on https://myindex to provide a newer version than PyPI. But people doing this are very likely already fully aware of the possibility that a newer mypackage on PyPI will break the workflow (and are willing to take the risk). They think they know what they’re doing, so I say whatever, let’s not stop them.

Additional context

Question for @bennuttall: Does piwheels currently serve projects that do not exist on PyPI? This design would keep everything working if it does not. If it does, would it be difficult to publish them also to PyPI?

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 16 (14 by maintainers)

Top GitHub Comments

2 reactions
brianfairservice commented, Apr 12, 2021

As you know, this change will disrupt people who depend on private packages from internal indexes via --extra-index-url while also depending on packages from the public PyPI via --index-url. What does the timeline look like for implementing this change? GitLab is planning a change which will mitigate the aforementioned problem (https://gitlab.com/gitlab-org/gitlab/-/issues/233413), and I’m hoping the change in pip will come about after GitLab gets their change in.

2 reactions
uranusjr commented, Mar 30, 2021

Yes. The public interface is organised well, but it’s pretty difficult to change what’s going on in the implementation, namely LinkCollector.collect_links(). Unfortunately this change must do that, since the current implementation throws away index information eagerly, making it impossible to track what links are obtained from the “main” index URL.
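To illustrate the point about index information being thrown away, here is a rough sketch of what keeping that provenance could look like. It does not reflect pip’s actual internals: `TaggedLink`, `collect_tagged`, and `apply_main_index_rule` are hypothetical names, and indexes are again modelled as plain dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TaggedLink:
    url: str
    from_main_index: bool  # provenance kept alongside the link


def collect_tagged(
    project: str,
    main: Dict[str, List[str]],
    extras: List[Dict[str, List[str]]],
) -> List[TaggedLink]:
    """Collect links for `project`, remembering which came from the main index."""
    links = [TaggedLink(url, True) for url in main.get(project, [])]
    for extra in extras:
        links.extend(TaggedLink(url, False) for url in extra.get(project, []))
    return links


def apply_main_index_rule(links: List[TaggedLink]) -> List[TaggedLink]:
    """Keep the links only if at least one of them came from the main index."""
    if any(link.from_main_index for link in links):
        return links
    return []
```

Because each link carries its origin, the rule can be applied after collection rather than requiring a separate lookup against the main index.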
