`--use-deprecated=html5lib` does not parse links, even though they're present
Description
When using pip 22.0 with `--use-deprecated=html5lib` and a JFrog (Artifactory) repository as the package index, pip fails with the error: ERROR: No matching distribution found for requests
Tested with the “requests” package on Windows 10: pip 22.0 fails, pip 21.3.1 works.
Expected behavior
`--use-deprecated=html5lib` should allow JFrog indexes to work.
pip version
22.0
Python version
3.10
OS
Windows
How to Reproduce
Install a package from a JFrog index using pip 22.0 with `--use-deprecated=html5lib`.
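To repeat the same install under different pip versions, the command can be scripted. The following is a minimal sketch, not part of the original report: the helper names `build_pip_command` and `run_repro` are hypothetical, and it passes the index explicitly with `--index-url`, whereas the report appears to rely on an index configured elsewhere.

```python
import subprocess
import sys


def build_pip_command(package, index_url, use_html5lib=True):
    """Build the argv for a verbose `pip install` against a single index."""
    cmd = [
        sys.executable, "-m", "pip", "install", "-vvv", package,
        "--index-url", index_url,
    ]
    if use_html5lib:
        # The flag under discussion; drop it to compare with the default parser.
        cmd.append("--use-deprecated=html5lib")
    return cmd


def run_repro(package, index_url, use_html5lib=True):
    """Run the install and return (exit code, combined output).

    Assumption: the caller supplies the URL of the real simple index;
    nothing here is specific to JFrog.
    """
    result = subprocess.run(
        build_pip_command(package, index_url, use_html5lib),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

Running `run_repro("requests", <index URL>)` once in an environment with pip 21.3.1 and once with pip 22.0 should make the difference in behavior easy to compare side by side.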
Output
C:\>python -m pip install -vvv requests --use-deprecated=html5lib
Using pip 22.0 from <corporate_local_path>\lib\site-packages\pip (python 3.10)
Non-user install by explicit request
Created temporary directory: <corporate_user_path>\AppData\Local\Temp\pip-ephem-wheel-cache-4a5e6ucc
Created temporary directory: <corporate_user_path>\AppData\Local\Temp\pip-req-tracker-p0zhtye3
Initialized build tracking at <corporate_user_path>\AppData\Local\Temp\pip-req-tracker-p0zhtye3
Created build tracker: <corporate_user_path>\AppData\Local\Temp\pip-req-tracker-p0zhtye3
Entered build tracker: <corporate_user_path>\AppData\Local\Temp\pip-req-tracker-p0zhtye3
Created temporary directory: <corporate_user_path>\AppData\Local\Temp\pip-install-_cnfjhxu
Looking in indexes: http://<corporate_domain>/artifactory/api/pypi/pypi-release/simple
1 location(s) to search for versions of requests:
* http://<corporate_domain>/artifactory/api/pypi/pypi-release/simple/requests/
Fetching project page and analyzing links: http://<corporate_domain>/artifactory/api/pypi/pypi-release/simple/requests/
Getting page http://<corporate_domain>/artifactory/api/pypi/pypi-release/simple/requests/
Found index url http://<corporate_domain>/artifactory/api/pypi/pypi-release/simple
Looking up http://<corporate_domain>/artifactory/api/pypi/pypi-release/simple/requests/ in the cache
Request header has "max_age" as 0, cache bypassed
Starting new HTTP connection (1): <corporate_domain>:80
http://<corporate_domain>:80 "GET /artifactory/api/pypi/pypi-release/simple/requests/ HTTP/1.1" 200 None
Updating cache with response from http://<corporate_domain>/artifactory/api/pypi/pypi-release/simple/requests/
Skipping link: not a file: http://<corporate_domain>/artifactory/api/pypi/pypi-release/simple/requests/
Given no hashes to check 0 links for project 'requests': discarding no candidates
ERROR: Could not find a version that satisfies the requirement requests (from versions: none)
ERROR: No matching distribution found for requests
Exception information:
Traceback (most recent call last):
  File "<corporate_local_path>\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 348, in resolve
    self._add_to_criteria(self.state.criteria, r, parent=None)
  File "<corporate_local_path>\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 173, in _add_to_criteria
    raise RequirementsConflicted(criterion)
pip._vendor.resolvelib.resolvers.RequirementsConflicted: Requirements conflict: SpecifierRequirement('requests')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<corporate_local_path>\lib\site-packages\pip\_internal\resolution\resolvelib\resolver.py", line 94, in resolve
    result = self._result = resolver.resolve(
  File "<corporate_local_path>\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 481, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "<corporate_local_path>\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 350, in resolve
    raise ResolutionImpossible(e.criterion.information)
pip._vendor.resolvelib.resolvers.ResolutionImpossible: [RequirementInformation(requirement=SpecifierRequirement('requests'), parent=None)]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<corporate_local_path>\lib\site-packages\pip\_internal\cli\base_command.py", line 165, in exc_logging_wrapper
    status = run_func(*args)
  File "<corporate_local_path>\lib\site-packages\pip\_internal\cli\req_command.py", line 205, in wrapper
    return func(self, options, args)
  File "<corporate_local_path>\lib\site-packages\pip\_internal\commands\install.py", line 339, in run
    requirement_set = resolver.resolve(
  File "<corporate_local_path>\lib\site-packages\pip\_internal\resolution\resolvelib\resolver.py", line 103, in resolve
    raise error from e
pip._internal.exceptions.DistributionNotFound: No matching distribution found for requests
Removed build tracker: '<corporate_user_path>\\AppData\\Local\\Temp\\pip-req-tracker-p0zhtye3'
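The two lines that matter in the output above are "Skipping link: not a file: ..." and "Given no hashes to check 0 links for project 'requests'": the only link pip evaluated was the project page URL itself, and no file links were collected from the page. A small sketch for pulling those lines out of a long `-vvv` log follows. The function name `summarize_pip_log` is hypothetical, and the regular expressions assume only the message wording shown in this output, which may differ in other pip versions.

```python
import re

# Patterns based solely on the log lines quoted in this report.
SKIP_RE = re.compile(r"^\s*Skipping link: (?P<reason>[^:]+): (?P<url>\S+)")
COUNT_RE = re.compile(
    r"check (?P<count>\d+) links for project '(?P<project>[^']+)'"
)


def summarize_pip_log(log_text):
    """Collect skipped links and the number of candidate links from a pip log."""
    skipped = []
    candidate_links = None
    for line in log_text.splitlines():
        match = SKIP_RE.match(line)
        if match:
            skipped.append((match.group("reason"), match.group("url")))
            continue
        match = COUNT_RE.search(line)
        if match:
            candidate_links = int(match.group("count"))
    return {"skipped": skipped, "candidate_links": candidate_links}
```

A result with `candidate_links` equal to 0 and a single "not a file" entry pointing at the project page matches the failure described here: the page was fetched successfully (HTTP 200) but no links were extracted from it.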
Code of Conduct
- I agree to follow the PSF Code of Conduct.
Issue Analytics
- State:
- Created 2 years ago
- Reactions:1
- Comments:5 (5 by maintainers)
Top GitHub Comments
The issue seems to be simply that the HTML doesn’t include a doctype (which seems to be required by PEP 503 and the HTML5 spec).
I’m unsure whether this is something where we should be lenient in what we accept.
Edit: Never mind, I missed that this was about the old parsing using html5lib.
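Whether an index page declares a doctype, and which anchors it contains, can be checked independently of pip. The sketch below uses only the standard library's `html.parser`; it is not pip's parsing code, and the names `IndexPageInspector` and `inspect_index_page` are hypothetical. As the edit above notes, the doctype question may not be what affects the html5lib code path, so treat this purely as a diagnostic aid.

```python
from html.parser import HTMLParser


class IndexPageInspector(HTMLParser):
    """Record the first declaration and every <a href=...> on a page."""

    def __init__(self):
        super().__init__()
        self.doctype = None
        self.hrefs = []

    def handle_decl(self, decl):
        # Called for declarations such as <!DOCTYPE html>;
        # decl is the text between "<!" and ">".
        if self.doctype is None:
            self.doctype = decl

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def inspect_index_page(html_text):
    """Return (has_html5_doctype, list_of_hrefs) for a simple index page."""
    parser = IndexPageInspector()
    parser.feed(html_text)
    parser.close()
    has_html5_doctype = (parser.doctype or "").strip().lower() == "doctype html"
    return has_html5_doctype, parser.hrefs
```

Feeding it the body returned by the index for the project page shows whether the anchors are present in the response at all, which separates a server-side problem from a parsing problem on pip's side.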
I’m able to reproduce this with just pip’s parsing logic:
21.3.1
22.0