Missing results within edit distance in fuzzy search
Hello, the issue https://github.com/olivernn/lunr.js/issues/375 still seems to occur in some cases. Fuzzy search, even with stemming disabled, misses some terms within the given edit distance. In particular, it misses terms that consist of the search term plus a suffix exactly as long as the maximum edit distance.
In other words, searching for abc~2 correctly matches abcx, but misses abcxx. Searching for abc~3 correctly matches abcx and abcxx, but misses abcxxx.
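As a sanity check on the expected behaviour, here is a minimal, library-independent sketch that computes the edit distance between the search term and each candidate. The editDistance helper is hypothetical (it is not part of Lunr) and implements plain Levenshtein distance without transpositions, which is enough for the pure-suffix cases described above.

```javascript
// Hypothetical helper, not part of Lunr: plain Levenshtein distance
// (insertions, deletions, substitutions) computed row by row.
function editDistance(a, b) {
  // prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // delete a[i-1]
        curr[j - 1] + 1,   // insert b[j-1]
        prev[j - 1] + cost // substitute or keep
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

for (const term of ["abcx", "abcxx", "abcxxx"]) {
  console.log(term, editDistance("abc", term));
}
```

Under this definition abcxx is at distance 2 and abcxxx at distance 3 from abc, so each should be matched by abc~2 and abc~3 respectively.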
Here is a fiddle reproducing the issue on Lunr v2.3.5 (latest release at the time of posting): https://jsfiddle.net/3ajf72Ly/1/
I know that https://github.com/olivernn/lunr.js/pull/382 addressed this already, and it did fix some cases, but not all.
This is also visible in the expansion of lunr.TokenSet.fromFuzzyString("abc", 2).toArray(), where abc** is missing:
**abc, **bc, **c, *a*bc, *a*c, *ab, *ab*, *ab*c, *abc, *abc*,
*ac, *acb, *b, *b*, *b*c, *bac, *bc, *bc*, *c, *cb, a*, a**, a**bc,
a**c, a*b, a*b*, a*b*c, a*bc, a*bc*, a*c, a*c*, a*cb, ab, ab*,
ab**, ab**c, ab*c, ab*c*, abc, abc*, ac, ac*, ac*b, acb, acb*, b,
b*, b*ac, b*c, ba, ba*, ba*c, bac, bac*, bc, bc*, bca
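The gap can be checked mechanically with a small sketch that tests a term against the expansion listed above. It assumes that each * in the expansion stands for exactly one arbitrary character (my reading of the fuzzy token set, not something confirmed here); matchesPattern and matchingPatterns are hypothetical helpers, not Lunr APIs.

```javascript
// Expansion copied verbatim from the list above.
const expansion = [
  "**abc, **bc, **c, *a*bc, *a*c, *ab, *ab*, *ab*c, *abc, *abc*,",
  "*ac, *acb, *b, *b*, *b*c, *bac, *bc, *bc*, *c, *cb, a*, a**, a**bc,",
  "a**c, a*b, a*b*, a*b*c, a*bc, a*bc*, a*c, a*c*, a*cb, ab, ab*,",
  "ab**, ab**c, ab*c, ab*c*, abc, abc*, ac, ac*, ac*b, acb, acb*, b,",
  "b*, b*ac, b*c, ba, ba*, ba*c, bac, bac*, bc, bc*, bca",
].join(" ").split(/,\s*/);

// Assumption: "*" matches exactly one character, so lengths must agree.
function matchesPattern(pattern, term) {
  if (pattern.length !== term.length) return false;
  for (let i = 0; i < pattern.length; i++) {
    if (pattern[i] !== "*" && pattern[i] !== term[i]) return false;
  }
  return true;
}

// Returns every pattern in the expansion that accepts the given term.
function matchingPatterns(term) {
  return expansion.filter((pattern) => matchesPattern(pattern, term));
}

console.log(matchingPatterns("abcx"));  // includes "abc*"
console.log(matchingPatterns("abcxx")); // empty: no listed pattern accepts it
console.log(matchesPattern("abc**", "abcxx")); // the missing pattern would accept it
```

Under that assumption, abcx is accepted by several patterns, while abcxx is accepted by none of them, which is consistent with the search results described above.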
Thanks again for the great library!
Issue Analytics
- Created 5 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
I’ve just pushed a fix for this in 2.3.6. Thanks again for taking the time to investigate and report the bug!
That’s annoying! Thanks for reporting, I’ll put together some tests and then figure out how to get a fix out.