pkg_resources.Distribution.hashcmp is slow


Because it strips the md5 fragment from the location on every access, pkg_resources.Distribution.hashcmp is slow:

@property
def hashcmp(self):
    return (
        self.parsed_version,
        self.precedence,
        self.key,
        _remove_md5_fragment(self.location),
        self.py_version or '',
        self.platform or '',
    )

Would there be side effects if location were turned into a property whose setter stores both the location and a copy of it with the fragment removed?
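A minimal sketch of the setter idea, not the actual pkg_resources change: `CachedLocationDist` is a hypothetical, simplified stand-in for `Distribution`, and `strip_md5_fragment` is an assumed approximation of the private `_remove_md5_fragment` helper, whose exact body is not shown in this issue.

```python
from urllib.parse import urlparse, urlunparse


def strip_md5_fragment(location):
    """Assumed stand-in for pkg_resources' private _remove_md5_fragment."""
    if not location:
        return ''
    parsed = urlparse(location)
    if parsed.fragment.startswith('md5='):
        # Rebuild the URL with an empty fragment component.
        return urlunparse(parsed[:-1] + ('',))
    return location


class CachedLocationDist:
    """Hypothetical, simplified stand-in for pkg_resources.Distribution."""

    def __init__(self, location, parsed_version=(), precedence=3,
                 key='', py_version=None, platform=None):
        self.parsed_version = parsed_version
        self.precedence = precedence
        self.key = key
        self.py_version = py_version
        self.platform = platform
        self.location = location  # goes through the setter below

    @property
    def location(self):
        return self._location

    @location.setter
    def location(self, value):
        # Strip the fragment once per assignment instead of once per comparison.
        self._location = value
        self._location_no_fragment = strip_md5_fragment(value)

    @property
    def hashcmp(self):
        return (
            self.parsed_version,
            self.precedence,
            self.key,
            self._location_no_fragment,  # cached, no URL parsing here
            self.py_version or '',
            self.platform or '',
        )
```

One side effect to watch for with this approach: any code that bypasses the setter (for example by writing to `_location` directly, or a subclass that overrides `location`) would leave the cached value stale.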

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (6 by maintainers)

Top GitHub Comments

2 reactions
gotcha commented, May 12, 2020

The simple example below calls hashcmp (thus _remove_md5_fragment) nearly 1.3 million times:

import pkg_resources
import setuptools.package_index

pki = setuptools.package_index.PackageIndex()
req = pkg_resources.Requirement.parse('setuptools')
pki.obtain(req)

I guess that setuptools is the most pathological case. Nevertheless, trying to obtain django still calls hashcmp more than 200k times.
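To illustrate why the call count grows so quickly, here is a self-contained sketch that needs no network access. It assumes that comparisons between distributions go through hashcmp on both operands; `Dist`, `strip_fragment`, and `count_calls` are hypothetical names made up for this illustration, and the URLs are placeholders.

```python
import functools


def count_calls(func):
    """Wrap func so that every invocation increments wrapper.calls."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def strip_fragment(location):
    # Simplified stand-in for _remove_md5_fragment.
    return location.split('#md5=', 1)[0]


class Dist:
    """Hypothetical stand-in whose ordering relies on hashcmp."""

    def __init__(self, version, location):
        self.version = version
        self.location = location

    @property
    def hashcmp(self):
        # The fragment is stripped on every access, as in the issue.
        return (self.version, strip_fragment(self.location))

    def __lt__(self, other):
        # Each comparison evaluates hashcmp twice.
        return self.hashcmp < other.hashcmp


dists = [
    Dist(i % 50, f'https://example.invalid/pkg-{i}.tar.gz#md5=abc')
    for i in range(1000)
]
sorted(dists)
print(strip_fragment.calls)  # far more calls than there are objects
```

Sorting is only one source of comparisons; the point is that every comparison pays the fragment-stripping cost twice, so the total scales with the number of comparisons rather than the number of distributions.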

1 reaction
gotcha commented, May 13, 2020

It is always better to remove code when possible. Thanks for #2108

To answer your question above about real-world usage, I am in the process of removing zc.buildout's dependency on setuptools.easy_install.

In a first phase, I plan to keep the dependency on setuptools.package_index. Is there any documentation on how soon it will be deprecated?

