Question about implementation of double hashing


Hi. I'm just looking into the hashing implementation here and I'm baffled by several aspects of the implementation, hoping you can tell me if there's anything I'm missing.

For example, in getIndices, you're rehashing the input on every iteration of the loop: https://github.com/Callidon/bloom-filters/blob/5c81fa4054465f446e3bb1606ddeceffdb907d81/src/utils.ts#L206-L209

Surely this defeats the point of double hashing, which is to simulate k independent hash functions given only two real ones? Not using double hashing at all would need one hash per index; your implementation does two hashes per index.
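For reference, here is a minimal sketch of what double hashing usually looks like for Bloom filter indices: hash the input exactly twice, then derive every index arithmetically as (h1 + i * h2) mod size. The names `fnv1a` and `doubleHashIndices` are hypothetical, and the FNV-1a-style hash is only a stand-in chosen to keep the example self-contained. None of this is the library's actual code.

```typescript
// Hypothetical stand-in hash (FNV-1a style over UTF-16 code units, seed mixed
// into the offset basis). Assumption: not the hash the library uses.
function fnv1a(input: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // 32-bit multiply, kept unsigned
  }
  return h >>> 0;
}

// Derive hashCount indices from exactly two hash evaluations.
// Assumes hashCount * size stays well below 2^53 so the arithmetic is exact.
function doubleHashIndices(input: string, size: number, hashCount: number): number[] {
  const a = fnv1a(input, 0) % size;        // first real hash
  const b = (fnv1a(input, 1) % size) || 1; // second real hash; avoid a zero step
  const indices: number[] = [];
  for (let i = 0; i < hashCount; i++) {
    indices.push((a + i * b) % size);      // no rehashing inside the loop
  }
  return indices;
}

console.log(doubleHashIndices("hello", 1024, 7));
```

The loop body contains no hash call at all, which is the property the question above is asking about.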

It's true that the hashes you're calculating on each loop aren't quite the same, because you're adding size % i (the number of cells modulo the loop iteration) to the seed each time. But why? That seems like a really strange thing to add to the seed. It doesn't guarantee that the seed is different on each loop (e.g. if the number of cells is even, it'll be 0 for the first 2 iterations). But again, that's not something you should want or need in double hashing anyway.
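To make the repeated-seed point concrete, the sketch below just tabulates size % i for the first few iterations. It assumes the loop counter starts at 1, as the "first 2 iterations" remark implies. The helper name `seedOffsets` is hypothetical and does not appear in the library.

```typescript
// Tabulate the value added to the seed on each iteration, assuming the loop
// counter i runs from 1 to hashCount (an assumption, not confirmed by the source).
function seedOffsets(size: number, hashCount: number): number[] {
  const offsets: number[] = [];
  for (let i = 1; i <= hashCount; i++) {
    offsets.push(size % i);
  }
  return offsets;
}

// With 100 cells and 5 hash functions, four of the five offsets are identical.
console.log(seedOffsets(100, 5)); // [ 0, 0, 1, 0, 0 ]
```

Whenever two iterations share an offset, they also share a seed, so the two hashes computed in those iterations are the same.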

Am I missing something?

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 7

Top GitHub Comments

1 reaction
SimonWoolf commented, Dec 3, 2021

Do you accept a merge of your work here? (see last commit, authorship added)

Sure, go for it

0 reactions
folkvir commented, Dec 2, 2021

Ahah 👍 we worked on exactly the same thing but in different ways, and I'm actually surprised by yours. I agree, this will almost never happen because the number of hash functions is not very high in practice. I mean, I have never seen anyone set a hashCount of 1000. But just in case, it will work. Do you accept a merge of your work here? (see last commit, authorship added)
