Option to only return results that match all tokens
For the purposes of this example, you can assume I have set the fuzziness threshold to 0.
Given I have a list of items like this:
var items = [
'large red shirt',
'large green shirt',
'large blue shirt',
'medium red shirt',
'medium green shirt',
'medium blue shirt',
'small red shirt',
'small green shirt',
'small blue shirt',
'large red trousers',
'large green trousers',
'large blue trousers',
'medium red trousers',
'medium green trousers',
'medium blue trousers',
'small red trousers',
'small green trousers',
'small blue trousers',
'large red socks',
'large green socks',
'large blue socks',
'medium red socks',
'medium green socks',
'medium blue socks',
'small red socks',
'small green socks',
'small blue socks'
];
I would like a search of "large shirt" to return the 3 results that match both words in the input:
[
'large red shirt',
'large green shirt',
'large blue shirt'
]
A default search returns 0 results.
Enabling the tokenize option to search for individual words does successfully return these 3 results; however, it also returns the other 12 "large" or "shirt" items. I am only interested in items that match both search tokens.
Could a matchAllTokens option be added to Fuse to achieve this?
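To make the requested behaviour concrete, here is a minimal sketch in plain JavaScript of the difference between "match any token" and "match all tokens". It uses exact substring matching to mirror the threshold-of-0 assumption above. The helper names (splitTokens, matchAnyToken, matchAllTokens) are hypothetical and written only for this illustration; this is not Fuse's implementation or API.

```javascript
// Rebuild the 27 items from the list above, in the same order.
const sizes = ['large', 'medium', 'small'];
const colours = ['red', 'green', 'blue'];
const garments = ['shirt', 'trousers', 'socks'];
const items = garments.flatMap(garment =>
  sizes.flatMap(size => colours.map(colour => `${size} ${colour} ${garment}`)));

// Hypothetical helper: split a query into lower-case, whitespace-separated tokens.
const splitTokens = query =>
  query.trim().toLowerCase().split(/\s+/).filter(Boolean);

// OR semantics: keep an item if at least one token occurs in it.
function matchAnyToken(list, query) {
  const tokens = splitTokens(query);
  return list.filter(item =>
    tokens.some(token => item.toLowerCase().includes(token)));
}

// AND semantics: keep an item only if every token occurs in it.
// Note: an empty query has no tokens, so every item passes.
function matchAllTokens(list, query) {
  const tokens = splitTokens(query);
  return list.filter(item =>
    tokens.every(token => item.toLowerCase().includes(token)));
}

console.log(matchAnyToken(items, 'large shirt').length); // 15
console.log(matchAllTokens(items, 'large shirt'));
// [ 'large red shirt', 'large green shirt', 'large blue shirt' ]
```

The only difference between the two helpers is some versus every, which is the switch the proposed option would need to expose.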
Issue Analytics
- State:
- Created 7 years ago
- Reactions:3
- Comments:7 (4 by maintainers)
Hey @krisk,
Thanks for taking the time to reply (and for building the library of course!)
I do appreciate that Fuse is all about string approximation, and I am using it for this feature. In my actual implementation I have not set the threshold to 0. I have just used that as an example in this ticket because I thought taking fuzziness out of the equation would make this issue clearer.
Maybe a more real-world example would better explain what I mean.
Say I have this list of companies, and I want to filter the list down based on a user entered search term:
A search input of “Australia” will return 2 results.
And a search input of “corporate” will return 4 results.
A search input of “Australia corporate”, which in the user’s mind is a more specific search term, will return all 5 results. It seems counter-intuitive for a more specific search term (“Australia corporate”) to return more results than a less specific search term (“Australia”).
I understand that this is useful behaviour in some use-cases, because we may want Fuse to uncover more results even if some of the tokens don’t match. But in this use case, we want to reduce the number of results as more tokens are provided as input.
Here’s a demo: https://jsbin.com/zoposik/4/edit?js,console
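As a rough illustration of the narrowing behaviour described above, one workaround is to search once per token and keep only the items that survive every pass. The company names below are hypothetical stand-ins (the original list is not reproduced here), chosen only so that the counts line up with the ones quoted above. searchOne is a plain substring match standing in for whatever single-term search is actually in use, and searchAllTokens is likewise a made-up name.

```javascript
// Hypothetical data: 2 names contain "Australia", 4 contain "Corporate".
const companies = [
  'Acme Australia Corporate Travel',
  'Globex Australia Holidays',
  'Initech Corporate Events',
  'Umbrella Corporate Catering',
  'Hooli Corporate Training'
];

// Stand-in for a single-term search (case-insensitive substring match).
const searchOne = (list, token) =>
  list.filter(name => name.toLowerCase().includes(token.toLowerCase()));

// AND semantics by successive narrowing: each token is searched only
// within the results that matched all previous tokens.
function searchAllTokens(list, query) {
  const tokens = query.trim().split(/\s+/).filter(Boolean);
  return tokens.reduce((remaining, token) => searchOne(remaining, token), list);
}

console.log(searchAllTokens(companies, 'Australia').length);           // 2
console.log(searchAllTokens(companies, 'corporate').length);           // 4
console.log(searchAllTokens(companies, 'Australia corporate').length); // 1
```

Because each pass can only remove items, adding a token can never increase the number of results, which is the property the user expects from a more specific search term.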
Also, it's not really important, but you may have been mistaken about the threshold parameter not having an effect when tokenize is true. On https://jsbin.com/pehixamoba/edit?js,console, this returns 15 results:
And this returns 27:
@keeganstreet, your example illustrates the problem quite nicely. All the other matched results might be superfluous.
Very well, I’m sold. I’ll add some logic + option to Fuse.js which would address the issue. I will post updates on this thread.
And about this:
You’re absolutely right. Mea culpa. I had made this change a while ago, and had forgotten about it😅