pip doesn't honor project index on root URL
Environment
- pip version: 20.3.3
- Python version: 3.8.6
- OS: Windows 10
Description
Per PEP 503, the root URL of an index contains URLs to each project, like https://pypi.org/simple/:
Within a repository, the root URL (/ for this PEP which represents the base URL) MUST be a valid HTML5 page with a single anchor element per project in the repository. The text of the anchor tag MUST be the name of the project and the href attribute MUST link to the URL for that particular project.
However, during my testing, pip ignores the project index on the root URL and directly accesses a self-constructed project URL.
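The self-constructed URLs in the logs further down look like the index URL with the normalized project name appended. Here is a minimal sketch of that construction, using the name normalization given in PEP 503. The function names `normalize` and `project_url` are made up for illustration, and this is not pip's actual code:

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_url(index_url: str, project: str) -> str:
    """Append the normalized project name to the index URL.

    Assumption: the result always ends with a trailing slash,
    which matches the URLs shown in the pip logs in this issue.
    """
    return index_url.rstrip("/") + "/" + normalize(project) + "/"


# Index URL and project name taken from the second reproduction in this issue.
print(project_url("https://st20201230.blob.core.windows.net/beta/simple/", "azure-cli"))
```

Under this sketch the anchors on the root page are never consulted, which is consistent with the observed requests.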
Expected behavior
pip should honor the project index on the root URL.
How to Reproduce
Install a non-existing package from a non-existing index.
> pip install mypackage20201230 --index https://st20201230.blob.core.windows.net/simple --no-cache-dir -vvv
Output
Using pip 20.3.3 from d:\cli\edge-env\lib\site-packages\pip (python 3.8)
...
Looking in indexes: https://st20201230.blob.core.windows.net/simple
1 location(s) to search for versions of mypackage20201230:
* https://st20201230.blob.core.windows.net/simple/mypackage20201230/
Fetching project page and analyzing links: https://st20201230.blob.core.windows.net/simple/mypackage20201230/
Getting page https://st20201230.blob.core.windows.net/simple/mypackage20201230/
Found index url https://st20201230.blob.core.windows.net/simple
Starting new HTTPS connection (1): st20201230.blob.core.windows.net:443
Incremented Retry for (url='/simple/mypackage20201230/'): Retry(total=4, connect=None, read=None, redirect=None, status=None)
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x000002C16C6EE3A0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /simple/mypackage20201230/
Even if the index contains valid content, such as
https://st20201230.blob.core.windows.net/beta/simple/
<html>
<head>
<title>Simple Index</title>
</head>
<body>
<h1>Simple Index</h1>
<a href="https://azurecliprod.blob.core.windows.net/edge/azure-cli/">azure-cli/</a><br />
</body>
</html>
The project URL https://azurecliprod.blob.core.windows.net/edge/azure-cli/ in the index is not honored and the installation fails trying to access https://st20201230.blob.core.windows.net/beta/simple/azure-cli/ instead:
> pip install azure-cli --index https://st20201230.blob.core.windows.net/beta/simple/ --no-cache-dir -vvv
Using pip 20.3.3 from d:\cli\edge-env\lib\site-packages\pip (python 3.8)
...
Looking in indexes: https://st20201230.blob.core.windows.net/beta/simple/
1 location(s) to search for versions of azure-cli:
* https://st20201230.blob.core.windows.net/beta/simple/azure-cli/
Fetching project page and analyzing links: https://st20201230.blob.core.windows.net/beta/simple/azure-cli/
Getting page https://st20201230.blob.core.windows.net/beta/simple/azure-cli/
Found index url https://st20201230.blob.core.windows.net/beta/simple/
Starting new HTTPS connection (1): st20201230.blob.core.windows.net:443
https://st20201230.blob.core.windows.net:443 "GET /beta/simple/azure-cli/ HTTP/1.1" 404 215
Could not fetch URL https://st20201230.blob.core.windows.net/beta/simple/azure-cli/: 404 Client Error: The specified blob does not exist. for url: https://st20201230.blob.core.windows.net/beta/simple/azure-cli/ - skipping
So the project azure-cli in the index st20201230.blob.core.windows.net/beta/simple must be available under the URL st20201230.blob.core.windows.net/beta/simple/azure-cli/.

Edit: I believe it is possible to use redirects to achieve this; the page can of course have other canonical URLs, but it must be accessible under <repository-index>/<project>/.

Sure. I frequently find PEPs vague about some details. Thank you very much for making things clear.
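For reference, the redirect idea from the comment above could be sketched roughly as follows. This is an untested illustration rather than a verified setup: the `CANONICAL` mapping, the `INDEX_PREFIX` value, and the port are assumptions, and it only shows a server answering `<repository-index>/<project>/` with a redirect to the page's canonical URL. Whether pip then resolves the links on the redirected page as intended has not been checked here.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Hypothetical mapping from normalized project name to the canonical project page.
CANONICAL = {
    "azure-cli": "https://azurecliprod.blob.core.windows.net/edge/azure-cli/",
}
# Assumed path of the repository index on this server.
INDEX_PREFIX = "/beta/simple/"


def redirect_target(path: str) -> Optional[str]:
    """Return the canonical URL for <repository-index>/<project>/, or None."""
    if not path.startswith(INDEX_PREFIX):
        return None
    project = path[len(INDEX_PREFIX):].strip("/")
    return CANONICAL.get(project)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)
            return
        # Temporary redirect to wherever the project page really lives.
        self.send_response(302)
        self.send_header("Location", target)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the redirecting index on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling `serve()` starts the server. A request for `/beta/simple/azure-cli/` would then get a 302 response pointing at the canonical page, while unknown projects get a 404.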