searching in a field for an exact phrase containing spaces
I have tags on all my web pages, and have built the lunr index with the tags field. Some tags are multiword phrases. If I search for tags:hand drawn maps
I get a completely wrong result. However, if I search for tags:hand*
I get the correct result. I have tried tags:+hand +drawn +maps
and tags:"hand drawn maps"
but without any success. Suggestions?
Update: I should add that I build my index and run my query like so:
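var idx = lunr(function () {
  this.field('title', { boost: 10 });
  this.field('tags');
  // Options go inside the call, as the second argument.
  this.field('body', { boost: 20 });
  this.field('created');
  this.ref('file');
  // Add every page to the index; `this` is the index builder.
  pages.forEach(function (doc) {
    this.add(doc);
  }, this);
});
// Run the query string q and map each hit to a displayable entry.
var searchResult = idx.search(q).map(function (result) {
  return {
    ref: result.ref,
    disp: result.ref.replace(/-/g, ' ')
  };
});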
Issue Analytics
- Created 5 years ago
- Comments:5 (2 by maintainers)
Top GitHub Comments
@icidasset yeah, sorry I didn’t explain that well, and the relevant document is a bit hidden.
Lunr uses a lunr.Token internally; these are mostly just wrappers around a string. lunr.tokenizer is used to create lists of tokens from the fields of the documents being indexed.
If a field is an array, Lunr assumes that the items in the array are already ‘tokens’; for everything else it assumes it has to split the field into tokens itself. It does this splitting on whitespace (among other characters).
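As a rough illustration of that rule, here is a simplified stand-in for the tokenizer. It is not Lunr's actual implementation: the names sketchTokenize and SEPARATOR are made up for this example, the separator (whitespace and hyphens) and the lower-casing are assumptions, and the real tokenizer produces lunr.Token objects rather than plain strings.

```javascript
// Simplified stand-in for the array-vs-string rule described above.
// NOT Lunr's real code: sketchTokenize and SEPARATOR are hypothetical names.
var SEPARATOR = /[\s\-]+/; // assumption: split on runs of whitespace or hyphens

function sketchTokenize(field) {
  if (Array.isArray(field)) {
    // Array items are taken as ready-made tokens, spaces and all.
    return field.map(function (item) {
      return String(item).toLowerCase();
    });
  }
  // Anything else is split into separate tokens; empty pieces are dropped.
  return String(field).toLowerCase().split(SEPARATOR).filter(Boolean);
}

console.log(sketchTokenize(['foo bar baz'])); // a single token
console.log(sketchTokenize('foo bar baz'));   // three separate tokens
```

For the tags field in the question, this would mean storing each page's tags as an array of phrases, so that a phrase such as hand drawn maps is indexed as one token.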
So, when you pass an array, ['foo bar baz'], Lunr assumes you have already done the splitting and the token is "foo bar baz"; when you just pass "foo bar baz", it does the splitting itself and you get "foo", "bar", "baz".
When performing a search using lunr.Index#search, the query string you use is parsed into a lunr.Query object. By default the parser assumes that whitespace indicates a term boundary; that is, a search for “foo bar” is interpreted as a search for the terms foo and bar. What you want is a search for the term foo bar.
You can bypass the query parsing entirely by using the lunr.Index#query method, which allows you to specify the term exactly how you want. Alternatively, you can continue using lunr.Index#search but escape the spaces in the query string. Finally, you could use wildcards to match the spaces, though this will match any character, so it is probably not a great solution.
I have put together a fiddle showing the three approaches. I think using lunr.Index#query is probably the best choice.
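The first two approaches might look roughly like the following. This is a sketch, not the contents of the fiddle: the helper names searchExactTerm, escapeSpaces and searchEscaped are invented here, idx is assumed to be an existing lunr 2.x index in which the field was indexed as an array of phrases, and the phrase is lower-cased on the assumption that the indexed tokens are lower-cased too.

```javascript
// Hypothetical helpers; `idx` is assumed to be an existing lunr 2.x index.

// Approach 1: bypass the query parser with lunr.Index#query.
function searchExactTerm(idx, field, phrase) {
  return idx.query(function (q) {
    // The whole phrase is passed as a single term, restricted to one field.
    q.term(phrase.toLowerCase(), { fields: [field] });
  });
}

// Approach 2: keep lunr.Index#search but put a backslash in front of
// every whitespace character so the parser does not split the phrase.
function escapeSpaces(phrase) {
  return phrase.replace(/\s/g, '\\$&');
}

function searchEscaped(idx, field, phrase) {
  // For ('tags', 'hand drawn maps') the query string is: tags:hand\ drawn\ maps
  return idx.search(field + ':' + escapeSpaces(phrase));
}
```

With the index from the question, a call could then be searchExactTerm(idx, 'tags', 'hand drawn maps') or searchEscaped(idx, 'tags', 'hand drawn maps'); both return the usual array of results, so the existing .map(...) step can stay as it is.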