Fails to parse data-requires-python with a `*` char

Bug description

pip-audit raises an InvalidSpecifier error when it parses a data-requires-python attribute from PyPI whose value contains a * character. This seems related to an older issue: https://github.com/pypa/pip-audit/issues/138

Reproduction steps

I found that the nltk package has the following link on its PyPI simple index page, which triggers the error:

<a 
  href="https://files.pythonhosted.org/packages/98/06/de681159e6750d0a215c2126e784504e177896afddf5a68cba42ebe42355/nltk-3.6-py3-none-any.whl#sha256=718de6908f538db19a77f96b9e6f5f586b0892d7de5eea32e71f2a2535ed8657" 
  data-requires-python="&gt;=3.5.*" 
>
  nltk-3.6-py3-none-any.whl
</a><br />
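
For anyone who wants to check which files of a project carry such a value, here is a minimal sketch that pulls data-requires-python out of simple-index HTML using only the standard library. The class name RequiresPythonCollector and the shortened example.invalid URL are made up for illustration; this is not how pip-audit itself reads the index.

```python
from html.parser import HTMLParser


class RequiresPythonCollector(HTMLParser):
    """Collect (href, data-requires-python) pairs from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        requires_python = attributes.get("data-requires-python")
        if requires_python is not None:
            # HTMLParser has already unescaped the value, so "&gt;" arrives as ">".
            self.found.append((attributes.get("href"), requires_python))


# Same shape as the anchor above, with a placeholder URL instead of the real one.
SAMPLE = (
    '<a href="https://example.invalid/nltk-3.6-py3-none-any.whl" '
    'data-requires-python="&gt;=3.5.*">nltk-3.6-py3-none-any.whl</a><br />'
)

collector = RequiresPythonCollector()
collector.feed(SAMPLE)
collector.close()

for href, requires_python in collector.found:
    print(href, requires_python)
```

Note that the value reaches the specifier parser as >=3.5.* once the &gt; entity is decoded, which matches the string in the error message.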

To reproduce, run pip-audit against a requirements file that lists this dependency:

echo "nltk" > requirements.txt
pip-audit -v --requirement requirements.txt

Screenshots and logs

DEBUG:pip_audit._cli:parsed arguments: Namespace(cache_dir=None, desc=<VulnerabilityDescriptionChoice.Auto: 'auto'>, dry_run=False, extra_index_urls=[], fix=False, format=<OutputFormatChoice.Columns: 'columns'>, ignore_vulns=[], index_url='https://pypi.org/simple/', local=False, no_deps=False, output=PosixPath('stdout'), paths=[], progress_spinner=<ProgressSpinnerChoice.On: 'on'>, project_path=None, require_hashes=False, requirements=[<_io.TextIOWrapper name='test_require.txt' mode='r' encoding='UTF-8'>], skip_editable=False, strict=False, timeout=15, verbose=True, vulnerability_service=<VulnerabilityServiceChoice.Pypi: 'pypi'>)
DEBUG:cachecontrol.controller:Looking up "https://pypi.org/simple/nltk/" in the cache
DEBUG:cachecontrol.controller:Current age based on date: 1186
DEBUG:cachecontrol.controller:Freshness lifetime from max-age: 600
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pypi.org:443
DEBUG:urllib3.connectionpool:https://pypi.org:443 "GET /simple/nltk/ HTTP/1.1" 304 0
Traceback (most recent call last):
  File "/usr/local/bin/pip-audit", line 8, in <module>
    sys.exit(audit())
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_cli.py", line 448, in audit
    for (spec, vulns) in auditor.audit(source):
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_audit.py", line 67, in audit
    for dep, vulns in self._service.query_all(specs):
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_service/interface.py", line 155, in query_all
    for spec in specs:
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/requirement.py", line 116, in collect
    for _, dep in self._collect_cached_deps(filename, reqs):
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/requirement.py", line 328, in _collect_cached_deps
    for req, resolved_deps in self._resolver.resolve_all(iter(req_values)):
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/interface.py", line 88, in resolve_all
    yield (req, self.resolve(req))
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/resolvelib.py", line 77, in resolve
    result = self.resolver.resolve([req])
  File "/usr/local/lib/python3.8/site-packages/resolvelib/resolvers.py", line 521, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/usr/local/lib/python3.8/site-packages/resolvelib/resolvers.py", line 372, in resolve
    self._add_to_criteria(self.state.criteria, r, parent=None)
  File "/usr/local/lib/python3.8/site-packages/resolvelib/resolvers.py", line 168, in _add_to_criteria
    candidates=build_iter_view(matches),
  File "/usr/local/lib/python3.8/site-packages/resolvelib/structs.py", line 169, in build_iter_view
    matches = list(matches)
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 362, in find_matches
    [
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 362, in <listcomp>
    [
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 203, in get_project_from_indexes
    yield from get_project_from_index(index_url, session, project, extras, timeout, state)
  File "/usr/local/lib/python3.8/site-packages/pip_audit/_dependency_source/resolvelib/pypi_provider.py", line 249, in get_project_from_index
    spec = SpecifierSet(py_req)
  File "/usr/local/lib/python3.8/site-packages/packaging/specifiers.py", line 700, in __init__
    parsed.add(Specifier(specifier))
  File "/usr/local/lib/python3.8/site-packages/packaging/specifiers.py", line 234, in __init__
    raise InvalidSpecifier(f"Invalid specifier: '{spec}'")
packaging.specifiers.InvalidSpecifier: Invalid specifier: '>=3.5.*'
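
The specifier is rejected because PEP 440 only allows a trailing .* with the == and != operators, so >=3.5.* is not a valid clause. Below is a rough, standard-library-only sketch of how a caller might detect such clauses, or rewrite them, before handing the string to SpecifierSet. The helper names has_invalid_wildcard and strip_invalid_wildcards are hypothetical, the assumption that >=3.5.* was meant as >=3.5 is mine, and this is not necessarily what the fix merged into pip-audit does.

```python
import re

# Operator followed by a version. Longer operators come first so that "==="
# and "<=" are not cut short by "==" and "<".
_CLAUSE = re.compile(r"^\s*(~=|===|==|!=|<=|>=|<|>)\s*(.+?)\s*$")


def _is_invalid_wildcard(operator: str, version: str) -> bool:
    # PEP 440 permits a trailing ".*" only with == and !=.
    return version.endswith(".*") and operator not in ("==", "!=")


def has_invalid_wildcard(requires_python: str) -> bool:
    """Return True if any comma-separated clause misuses a trailing '.*'."""
    for clause in requires_python.split(","):
        match = _CLAUSE.match(clause)
        if match and _is_invalid_wildcard(*match.groups()):
            return True
    return False


def strip_invalid_wildcards(requires_python: str) -> str:
    """Rewrite e.g. '>=3.5.*' to '>=3.5'; valid clauses are left untouched.

    Assumption: the publisher meant the version without the wildcard.
    """
    fixed = []
    for clause in requires_python.split(","):
        match = _CLAUSE.match(clause)
        if match and _is_invalid_wildcard(*match.groups()):
            operator, version = match.groups()
            clause = f"{operator}{version[:-2]}"
        fixed.append(clause.strip())
    return ",".join(fixed)


print(has_invalid_wildcard(">=3.5.*"))
print(strip_invalid_wildcards(">=3.5.*"))
```

A more conservative option is to skip any file for which has_invalid_wildcard returns True instead of rewriting its metadata, since rewriting guesses at what the publisher intended.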

Platform information

  • Linux
  • pip-audit 2.4.9
  • Python 3.8.2
  • pip 22.3.1

Issue Analytics

  • State: closed
  • Created 9 months ago
  • Comments: 12 (11 by maintainers)

Top GitHub Comments

1 reaction
j0ack commented, Dec 22, 2022

You’re very welcome @woodruffw. I did not think the issue would be resolved only a few hours after posting it. I think it will work with the code you merged in #447. Thank you.

1 reaction
pradyunsg commented, Dec 22, 2022

pip is still on packaging 21.3, which is why it still accepts that. I expect we’ll start rejecting/skipping those 3-6 months from now (or at least being really noisy in warning about them).
