False positive matches when pattern.length > 32
I’ve noticed that when the length of my search pattern is > 32, Fuse matches everything in my dataset. Interestingly, if I “include matches”, there aren’t actually any indices in the resulting matches array.
Here’s a live demo: http://codepen.io/mgalgs/pen/jyROXg
Also pasted here:
document.addEventListener("DOMContentLoaded", function(event) {
  var list = [{
    text: "pizza"
  }, {
    text: "feast"
  }];
  var options = {
    include: ["score", "matches"],
    shouldSort: true,
    threshold: 0.5,
    location: 0,
    distance: 0,
    maxPatternLength: 50,
    minMatchCharLength: 4,
    keys: [
      "text"
    ]
  };
  var body = document.getElementsByTagName('body')[0];
  body.innerHTML = '<h3>Fuse <b>always</b> matches when pattern length > 32</h3>';
  var fuse = new Fuse(list, options); // "list" is the item array
  var patterns = [];
  var i;
  for (i = 0; i < 40; ++i) {
    patterns.push('w'.repeat(i));
  }
  patterns = patterns.concat(["i like pie", "123456789112345678921234567890123", "pizza",
    "feast", "stuff is good", "pizzza", "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
  ]);
  for (i = 0; i < patterns.length; ++i) {
    var pattern = patterns[i];
    var matched = fuse.search(pattern).length > 0;
    body.innerHTML += 'Pattern (len=' + pattern.length + ') ' + pattern + (matched ? ' matched' : " didn't match") + '<br>';
  }
});
Results in:
Fuse always matches when pattern length > 32
Pattern (len=0) didn't match
Pattern (len=1) w didn't match
Pattern (len=2) ww didn't match
Pattern (len=3) www didn't match
Pattern (len=4) wwww didn't match
Pattern (len=5) wwwww didn't match
Pattern (len=6) wwwwww didn't match
Pattern (len=7) wwwwwww didn't match
Pattern (len=8) wwwwwwww didn't match
Pattern (len=9) wwwwwwwww didn't match
Pattern (len=10) wwwwwwwwww didn't match
Pattern (len=11) wwwwwwwwwww didn't match
Pattern (len=12) wwwwwwwwwwww didn't match
Pattern (len=13) wwwwwwwwwwwww didn't match
Pattern (len=14) wwwwwwwwwwwwww didn't match
Pattern (len=15) wwwwwwwwwwwwwww didn't match
Pattern (len=16) wwwwwwwwwwwwwwww didn't match
Pattern (len=17) wwwwwwwwwwwwwwwww didn't match
Pattern (len=18) wwwwwwwwwwwwwwwwww didn't match
Pattern (len=19) wwwwwwwwwwwwwwwwwww didn't match
Pattern (len=20) wwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=21) wwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=22) wwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=23) wwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=24) wwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=25) wwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=26) wwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=27) wwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=28) wwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=29) wwwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=30) wwwwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=31) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=32) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww didn't match
Pattern (len=33) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=34) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=35) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=36) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=37) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=38) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=39) wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww matched
Pattern (len=10) i like pie didn't match
Pattern (len=33) 123456789112345678921234567890123 matched
Pattern (len=5) pizza matched
Pattern (len=5) feast matched
Pattern (len=13) stuff is good didn't match
Pattern (len=6) pizzza matched
Pattern (len=33) bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb matched
Issue Analytics
- State:
- Created 7 years ago
- Reactions:6
- Comments:10 (2 by maintainers)
Top GitHub Comments
Not sure how you would want to handle this, but the issue lies in bitap_search.js, in the following line:
const mask = 1 << (patternLen - 1)
Since JS converts values to int32 for all bitwise operations, patternLen > 30 overflows. I’ll be looking more into this later on.

I have this same issue here. That much time has passed and the bug is not fixed, or am I missing something?
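For anyone wanting to check the shift behaviour described in the first comment above, here is a minimal sketch in plain JavaScript. It only evaluates the quoted expression `1 << (patternLen - 1)` for a few lengths; it does not reproduce the rest of bitap_search.js. The helper names `topBitMask` and `fitsInInt32Mask` are made up for illustration and are not part of Fuse. JavaScript's `<<` works on 32-bit integers and uses only the low 5 bits of the shift count, so the mask wraps around once the pattern is longer than 32 characters, which appears consistent with the len=33 cutoff in the results above.

```javascript
// Hypothetical helper: evaluates the mask expression quoted from bitap_search.js.
function topBitMask(patternLen) {
  return 1 << (patternLen - 1);
}

// Hypothetical guard (an assumption, not Fuse's actual fix): a single
// 32-bit mask can only represent patterns of 1 to 32 characters.
function fitsInInt32Mask(patternLen) {
  return patternLen >= 1 && patternLen <= 32;
}

console.log(topBitMask(31)); // 1073741824  (bit 30 set)
console.log(topBitMask(32)); // -2147483648 (bit 31 set, so the int32 reads as negative)
console.log(topBitMask(33)); // 1           (shift count 32 wraps to 0)
console.log(topBitMask(34)); // 2           (shift count 33 wraps to 1)
```

A caller could use a check like `fitsInInt32Mask(pattern.length)` to reject or truncate long patterns before searching, as a stopgap until the library handles them itself.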