Using invertedIndex for autocomplete
Not so much an issue with lunr, which is great! More a quick attempt to get ideas going…
In a shell with the jq utility I pull my terms from the lunr index in advance:
jq '[.index.invertedIndex[][0]|scan("^\\w{3,}")]|unique' index.json > iindex.json
I can feed that to the jQuery UI autocomplete widget (http://api.jqueryui.com/autocomplete/) like below:
// Fold a few German characters to ASCII; every other character is kept as-is.
function normalize(str) {
  var map = { "ä": "a", "ö": "o", "ü": "u", "ß": "ss" };
  return str.replace(/[^A-Za-z0-9]/g,
    function(a) { return map[a]||a; }
  );
}
// Load the pre-extracted term list and use it as the autocomplete source.
$.getJSON('iindex.json', function (tags) {
  $('#query').autocomplete({
    minLength: 3,
    source: function(inp, out) {
      var t = normalize(inp.term);
      var r = $.ui.autocomplete.filter(tags, t);
      out(r);
    }
  });
});
Not perfect, but it works acceptably so far. What would be truly nice is to build the autocomplete index on the client and have the term to match processed by the indexer instead of the crude normalizer above.
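For the first half of that wish, the jq step could probably be moved into the browser. This is only a rough sketch: it assumes the same file layout the jq filter reads (a top-level index key holding a serialized lunr index whose invertedIndex is an array of [term, postings] pairs), and termsFromIndex is a made-up helper name, not part of lunr. The regex tries to approximate jq's \w with Unicode property escapes, so results may differ slightly from the jq output.

```javascript
// Hypothetical helper mirroring the jq filter
//   [.index.invertedIndex[][0] | scan("^\\w{3,}")] | unique
// ASSUMPTION: data.index.invertedIndex is an array of [term, postings] pairs.
function termsFromIndex(data) {
  // Leading run of 3+ letters, digits or underscores. Plain \w in JavaScript
  // is ASCII-only, so \p{L} and \p{N} with the u flag are used to keep terms
  // such as "über".
  var prefix = /^[\p{L}\p{N}_]{3,}/u;
  var terms = new Set();
  data.index.invertedIndex.forEach(function (entry) {
    var match = prefix.exec(entry[0]);
    if (match) {
      terms.add(match[0]);
    }
  });
  // Deduplicated and sorted, like jq's unique.
  return Array.from(terms).sort();
}
```

With something like this, the getJSON call above could load index.json directly and pass termsFromIndex(data) to the widget as tags, so no separate iindex.json would be needed.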
Issue Analytics
- State:
- Created 6 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
First, a HUGE thanks to both @hungerburg and @olivernn for this. I combined both suggestions and it’s working great. For anyone wanting to do the same, this is what worked for me…
I’m indexing like this:
And modified the autocomplete function suggested by @hungerburg to use the unstemmed words like this:
Thanks!
Sorry for the late reply.
You could definitely wrap that normalise function up into a lunr plugin. There is a similar project, lunr-unicode-normalizer, but I don’t think it has been updated for lunr 2.
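To make the plugin idea a bit more concrete, here is a minimal sketch of how the folding from the original post might be wrapped for lunr 2. It is not the lunr-unicode-normalizer code and has not been run against a real index. The names foldString, foldToken and makeFoldingPlugin are made up for illustration, and lunr is passed in as a parameter so the sketch does not assume a particular loader. Appending the function with add (rather than placing it before the stemmer) is a judgement call that would need experimenting.

```javascript
// Folding table taken from the normalize() function in the original post.
var FOLD_MAP = { "ä": "a", "ö": "o", "ü": "u", "ß": "ss" };

// Replace mapped characters; anything not in the table is left unchanged.
function foldString(str) {
  return str.replace(/[^A-Za-z0-9]/g, function (ch) {
    return FOLD_MAP[ch] || ch;
  });
}

// Hypothetical factory: returns a function intended for this.use(...)
// inside lunr(function () { ... }).
function makeFoldingPlugin(lunr) {
  // A lunr 2 pipeline function receives a token and returns a token;
  // token.update applies a string transform in place.
  function foldToken(token) {
    return token.update(foldString);
  }

  // Registering gives the function a label, which lunr uses when an
  // index with a custom pipeline is serialized and loaded again.
  lunr.Pipeline.registerFunction(foldToken, 'foldToken');

  return function (builder) {
    // Apply to both indexed text and search terms so they stay comparable.
    builder.pipeline.add(foldToken);
    builder.searchPipeline.add(foldToken);
  };
}
```

Usage would then look like this.use(makeFoldingPlugin(lunr)) inside the index definition.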
As for autocomplete, I need to get round to actually putting a demo of this together, but this is what I’ve been suggesting to people.
I disable the pipeline to prevent stemming from getting in the way; you would have to experiment to see whether this makes sense for your use case, especially if you wanted to add the unicode normalising plugin.
Additionally, when using the query method, lunr won't be doing any tokenisation for you. You can either handle this yourself, or borrow the lunr.tokenizer directly, or its regex, to split the input into individual tokens.
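For anyone reading along, the tokenising step might look roughly like the sketch below. It is not the snippet from this thread, just an illustration under assumptions: prefixSearch is a made-up helper, lunr and a built lunr 2 index idx are passed in as parameters, and the input is split with lunr.tokenizer.separator before each token is added as a trailing-wildcard term.

```javascript
// Hypothetical helper: prefix search for every token in the user's input.
// ASSUMPTION: idx is a lunr 2 index, ideally built without the stemmer.
function prefixSearch(lunr, idx, input) {
  // The query method does no tokenising, so split the input here using
  // the same separator regex the lunr tokenizer uses.
  var tokens = input
    .toLowerCase()
    .split(lunr.tokenizer.separator)
    .filter(function (t) { return t.length > 0; });

  return idx.query(function (q) {
    tokens.forEach(function (t) {
      // A trailing wildcard turns each token into a prefix match.
      q.term(t, { wildcard: lunr.Query.wildcard.TRAILING });
    });
  });
}
```

Each term is added with lunr's default presence, so a document matching any one of the tokens can be returned. Whether that is the right behaviour for an autocomplete box is something to experiment with.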