Hash checking: respect third party indices

Is your feature request related to a problem? Please describe.

Yes.

Currently, if you use a privately hosted repo and build your own wheels for various projects, you cannot use pip-audit -s pypi while listing only the hashes of your privately hosted wheels. This leaves us with two options: add the upstream digest hash (which we don't use) to our hash list for each package, or use -s osv.

We are currently using osv. I am unsure whether it is inferior, but we would like to use the default where we can.
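
To illustrate the mismatch described above: a wheel rebuilt in-house is a different file from the one PyPI hosts, so its SHA-256 digest differs even for the same name and version. The following is a minimal sketch that uses placeholder bytes rather than real wheels; the file names and the `sha256_of` helper are made up for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder contents standing in for two builds of the same release.
    upstream = Path(tmp) / "upstream.whl"
    private = Path(tmp) / "private.whl"
    upstream.write_bytes(b"placeholder: wheel as published upstream")
    private.write_bytes(b"placeholder: same release, rebuilt in-house")
    upstream_digest = sha256_of(upstream)
    private_digest = sha256_of(private)

# Different bytes give different digests, so a pinned private hash
# will not appear among the digests of the upstream files.
print(upstream_digest == private_digest)  # False
```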

Describe the solution you’d like

Ideally, the hash-checking logic at https://github.com/pypa/pip-audit/blob/76e4fa4a0ed005f543701d4b60c02c75c5b539d2/pip_audit/_service/pypi.py#L89 would have a flag around it to disable hash checking in situations like this.
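
As a rough sketch of what such a flag could look like (this is not pip-audit's actual code): the function below compares listed hashes against per-file digests and returns early when a hypothetical `skip_hash_check` option is set. The `check_hashes` name, the flag, the local `ServiceError` stand-in, and the shape of `release_files` (dicts with a `digests` mapping, modelled on the file entries in PyPI's JSON response) are all assumptions made for illustration.

```python
from typing import Dict, List


class ServiceError(Exception):
    """Local stand-in for the error type in the traceback; not pip-audit's class."""


def check_hashes(
    name: str,
    version: str,
    listed: Dict[str, List[str]],
    release_files: List[dict],
    skip_hash_check: bool = False,  # hypothetical flag, not an existing option
) -> None:
    """Raise ServiceError if a listed hash matches none of the release files."""
    if skip_hash_check:
        return
    for hash_type, hash_values in listed.items():
        for hash_value in hash_values:
            found = any(
                f.get("digests", {}).get(hash_type) == hash_value
                for f in release_files
            )
            if not found:
                raise ServiceError(
                    f"Mismatched hash for {name} ({version}): listed "
                    f"{hash_value} of type {hash_type} could not be found "
                    "in PyPI releases"
                )


# Placeholder data: the upstream file has one digest, we pinned another.
release_files = [{"digests": {"sha256": "aaaa"}}]
listed = {"sha256": ["bbbb"]}

# With the flag set, the mismatch is ignored.
check_hashes("pkg", "1.0", listed, release_files, skip_hash_check=True)
```

Without the flag, the same call raises `ServiceError`, which mirrors the failure shown under "Additional context".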

Describe alternatives you’ve considered

As mentioned above, using -s osv.

Additional context

Example of current failure:

DEBUG:cachecontrol.controller:No cache entry available
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): pypi.org:443
DEBUG:urllib3.connectionpool:https://pypi.org:443 "GET /pypi/adal/1.2.2/json HTTP/1.1" 200 1128
DEBUG:cachecontrol.controller:Updating cache with response from "https://pypi.org/pypi/adal/1.2.2/json"
DEBUG:cachecontrol.controller:etag object cached for 1209600 seconds
DEBUG:cachecontrol.controller:Caching due to etag
DEBUG:cachecontrol.controller:Looking up "https://pypi.org/pypi/aenum/3.1.0/json" in the cache
DEBUG:cachecontrol.controller:No cache entry available
DEBUG:urllib3.connectionpool:https://pypi.org:443 "GET /pypi/aenum/3.1.0/json HTTP/1.1" 200 2235
DEBUG:cachecontrol.controller:Updating cache with response from "https://pypi.org/pypi/aenum/3.1.0/json"
DEBUG:cachecontrol.controller:etag object cached for 1209600 seconds
DEBUG:cachecontrol.controller:Caching due to etag
DEBUG:cachecontrol.controller:Looking up "https://pypi.org/pypi/aiobotocore/1.2.2/json" in the cache
DEBUG:cachecontrol.controller:No cache entry available
DEBUG:urllib3.connectionpool:https://pypi.org:443 "GET /pypi/aiobotocore/1.2.2/json HTTP/1.1" 200 5390
DEBUG:cachecontrol.controller:Updating cache with response from "https://pypi.org/pypi/aiobotocore/1.2.2/json"
DEBUG:cachecontrol.controller:etag object cached for 1209600 seconds
DEBUG:cachecontrol.controller:Caching due to etag
Traceback (most recent call last):
  File "/Users/carl.gill/.pyenv/versions/3.7.3/bin/pip-audit", line 8, in <module>
    sys.exit(audit())
  File "/Users/carl.gill/.pyenv/versions/3.7.3/lib/python3.7/site-packages/pip_audit/_cli.py", line 439, in audit
    for (spec, vulns) in auditor.audit(source):
  File "/Users/carl.gill/.pyenv/versions/3.7.3/lib/python3.7/site-packages/pip_audit/_audit.py", line 67, in audit
    for dep, vulns in self._service.query_all(specs):
  File "/Users/carl.gill/.pyenv/versions/3.7.3/lib/python3.7/site-packages/pip_audit/_service/interface.py", line 156, in query_all
    yield self.query(spec)
  File "/Users/carl.gill/.pyenv/versions/3.7.3/lib/python3.7/site-packages/pip_audit/_service/pypi.py", line 101, in query
    f"Mismatched hash for {spec.canonical_name} ({spec.version}): listed "
pip_audit._service.interface.ServiceError: Mismatched hash for aiobotocore (1.2.2): listed 290833e77f3992cf947279106e43f97dc66f72731cf6818d89a171295bda79ba of type sha256 could not be found in PyPI releases

Issue Analytics

  • State: open
  • Created 9 months ago
  • Comments: 11 (8 by maintainers)

Top GitHub Comments

1 reaction
tetsuo-cpp commented, Dec 13, 2022

I did a bit of digging and it does seem like we’ll need to move the hash checking into the dependency resolution layer. We want to be compatible with alternative indexes like DevPI which don’t support the same JSON API that Warehouse does.

One catch is that the simple endpoint only gives us sha256 whereas pip install allows hashes of type sha256, sha384 and sha512. However, pinning with non-sha256 hashes is already an issue for us because the JSON API only gives us sha256 and md5 (which pip doesn’t allow for pinning).

I’d say the best plan of attack is to move the hash checking into dependency resolution and, for non-sha256 hashes, to skip the check and emit a warning (we need to do this regardless).
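
One way to picture that plan is the sketch below, which is an illustration under stated assumptions and not the project's implementation. It assumes file URLs from a simple index carry a `#sha256=<digest>` fragment, checks listed sha256 hashes against them, and skips other hash types with a warning. The function names and the example URL are hypothetical.

```python
import warnings
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlsplit


def fragment_hash(file_url: str) -> Optional[Tuple[str, str]]:
    """Return (hash_type, hash_value) from a '#<type>=<value>' fragment, or None."""
    fragment = urlsplit(file_url).fragment
    hash_type, sep, hash_value = fragment.partition("=")
    if not sep or not hash_type or not hash_value:
        return None
    return hash_type, hash_value


def verify_listed_hashes(listed: Dict[str, List[str]], file_urls: List[str]) -> bool:
    """True if every listed sha256 hash matches some file URL.

    Hashes of any other type are skipped with a warning.
    """
    available = {h for h in map(fragment_hash, file_urls) if h is not None}
    ok = True
    for hash_type, hash_values in listed.items():
        if hash_type != "sha256":
            warnings.warn(
                f"skipping {len(hash_values)} hash(es) of unsupported type {hash_type}"
            )
            continue
        for value in hash_values:
            if ("sha256", value) not in available:
                ok = False
    return ok


# Hypothetical private-index URL with a placeholder digest.
urls = ["https://index.example/packages/pkg-1.0-py3-none-any.whl#sha256=abc123"]
print(verify_listed_hashes({"sha256": ["abc123"]}, urls))  # True
print(verify_listed_hashes({"sha256": ["def456"]}, urls))  # False
```

Because the check runs against whatever index served the files, it would not depend on the index also offering Warehouse's JSON API.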

1 reaction
carl-armis commented, Dec 12, 2022

@woodruffw Feel free to change the title of this to better match the “bug”. Not sure it makes sense as is anymore.
