searching in a field for an exact phrase containing spaces

I have tags on all my web pages, and have built the lunr index with the tags field. Some tags are multiword phrases. If I search for tags:hand drawn maps I get completely wrong results. However, if I search for tags:hand* I get the correct result. I have tried tags:+hand +drawn +maps and tags:"hand drawn maps", but without any success. Suggestions?

Update: I should add that I build my index and run my query like so:

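    // Build the index. Each call is its own statement, and the boost
    // option is passed inside the field() call it applies to.
    var idx = lunr(function () {
        this.field('title', { boost: 10 });
        this.field('tags');
        this.field('body', { boost: 20 });
        this.field('created');
        this.ref('file');

        pages.forEach(function (doc) {
            this.add(doc);
        }, this);
    });

    // Run the query string q and keep only the ref plus a display name.
    var searchResult = idx.search(q).map(function (result) {
        return {
            ref: result.ref,
            disp: result.ref.replace(/-/g, ' ')
        };
    });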

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

8 reactions
olivernn commented, Aug 6, 2018

@icidasset yeah, sorry I didn’t explain that well, and the relevant document is a bit hidden.

Lunr uses lunr.Token objects internally; these are mostly just wrappers around a string. lunr.tokenizer is used to create lists of tokens from the fields of the documents being indexed.

If a field is an array, Lunr assumes that the items in the array are already ‘tokens’. For everything else, it assumes it has to split the field into tokens itself. It does this splitting on whitespace (among other characters).

So, when you pass an array such as ['foo bar baz'], Lunr assumes you have already done the splitting and the token is "foo bar baz". When you just pass the string "foo bar baz", it does the splitting itself and you get "foo", "bar", "baz".
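
To make the array behaviour concrete, here is a minimal sketch of preparing documents so that each multiword tag reaches Lunr as a single token. It assumes the tags are stored as one comma-separated string per page, which the question does not actually state; tagsToTokens and prepareForIndexing are hypothetical helper names, not part of Lunr.

```javascript
// Hypothetical helper: turn a comma-separated tag string (an assumption
// about how the tags are stored) into an array with one entry per tag.
// Lunr treats array items as ready-made tokens, so "hand drawn maps"
// stays in one piece instead of being split on whitespace.
function tagsToTokens(tags) {
  var list = Array.isArray(tags) ? tags : String(tags).split(',');
  return list
    .map(function (tag) { return String(tag).trim(); })
    .filter(function (tag) { return tag.length > 0; });
}

// Hypothetical helper: copy a page object and replace its tags field
// with the array form, leaving the original object untouched.
function prepareForIndexing(doc) {
  return Object.assign({}, doc, { tags: tagsToTokens(doc.tags) });
}

// Possible use inside the lunr(function () { ... }) builder from the question:
//   pages.forEach(function (doc) {
//     this.add(prepareForIndexing(doc));
//   }, this);
```

With documents prepared this way, the tags field holds whole phrases as tokens, so a query has to target the whole phrase as one term as well.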

6 reactions
olivernn commented, Aug 1, 2018

When performing a search using lunr.Index#search, the query string you use is parsed into a lunr.Query object. By default, the parser assumes that whitespace indicates a term boundary; that is, a search for “foo bar” is interpreted as a search for the terms foo and bar. What you want is a search for the single term foo bar.

You can bypass the query parsing entirely by using the lunr.Index#query method, which allows you to specify the term exactly how you want. Alternatively, you can continue using lunr.Index#search but escape the spaces in the query string. Finally, you could use wildcards to match the spaces, though a wildcard will match any character, so this is probably not a great solution.

I have put together a fiddle showing the three approaches. I think using lunr.Index#query is probably the best choice.
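
The fiddle itself is not reproduced here, so the following is only a rough sketch of what the three approaches might look like, not the maintainer's code. It assumes idx is a Lunr 2.x index whose tags field was indexed as an array, so that a token like "hand drawn maps" exists in the index. The function names are hypothetical; lunr.Index#query, lunr.Query#term with a fields option, and backslash escaping in lunr.Index#search are the only Lunr features relied on. The term is lowercased on the assumption that Lunr lowercases tokens when indexing while lunr.Query#term does not.

```javascript
// Approach 1: skip the query parser and add the phrase as a single term.
function searchExactTerm(idx, field, phrase) {
  return idx.query(function (q) {
    // Lowercase by hand: the term is passed through as given (assumption
    // noted above), while indexed tokens are lowercased.
    q.term(phrase.toLowerCase(), { fields: [field] });
  });
}

// Approach 2: keep idx.search but escape each space with a backslash so
// the parser does not treat it as a term boundary.
function escapeSpaces(phrase) {
  return phrase.replace(/ /g, '\\ ');
}

function searchEscaped(idx, field, phrase) {
  return idx.search(field + ':' + escapeSpaces(phrase));
}

// Approach 3: replace each space with a wildcard. This also matches any
// other characters in that position, so it is the loosest option.
function searchWithWildcards(idx, field, phrase) {
  return idx.search(field + ':' + phrase.replace(/ /g, '*'));
}

// Possible use, with idx built as in the question:
//   var results = searchExactTerm(idx, 'tags', 'hand drawn maps');
```

The first function never builds a query string, which avoids any escaping concerns and is in line with the recommendation to prefer lunr.Index#query.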
