Build Sparse Index for NQ
Hi 😃
First, many thanks for this awesome library!
I wanted to ask: what is the command to build the sparse index over DPR’s data, psgs_w100.tsv (i.e., to reproduce your wikipedia-dpr index)?
Issue Analytics
- Created 2 years ago
- Comments: 10 (5 by maintainers)
Top GitHub Comments
-storeContents stores the document vector representations in the index, which enables relevance feedback.
Here’s the “model card” for the index: https://git.uwaterloo.ca/jimmylin/anserini-indexes/-/blob/master/index-wikipedia-dpr-20210120-d1b9e6-readme.txt
If you are able to perform retrieval and get the same level of effectiveness, then you should be good.
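One way to compare effectiveness for DPR-style retrieval is top-k accuracy: the fraction of questions for which at least one of the top k retrieved passages contains a gold answer. The sketch below is only an illustration. The function name and input layout are hypothetical, and it uses plain lowercase substring matching, which is a simplification and will not exactly match the numbers from an official evaluation script.

```python
def top_k_accuracy(retrieved, answers, k):
    """Fraction of questions with a gold answer string in the top-k passages.

    retrieved: dict mapping question id -> ranked list of passage texts
    answers:   dict mapping question id -> list of gold answer strings
    NOTE: plain substring matching is an assumption made for brevity.
    """
    if not retrieved:
        return 0.0
    hits = 0
    for qid, passages in retrieved.items():
        golds = [a.lower() for a in answers[qid]]
        top = [p.lower() for p in passages[:k]]
        # A question counts as a hit if any gold answer appears in any top-k passage.
        if any(g in p for p in top for g in golds):
            hits += 1
    return hits / len(retrieved)
```

If the scores from your own index land close to those reported in the model card at the same k, the two indexes are likely equivalent for retrieval purposes.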
@oriram could you double-check whether you added the -storeRaw option? We built the index with the raw collection stored in the index. The raw collection should be than 3.5G.
Without storing any additional things (i.e., without storeRaw, storeVector, etc.), the pure index will be around 2.5G.
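A quick way to check whether your own build matches these sizes is to measure the index directory on disk. This is a minimal sketch using only the Python standard library. The helper name and the example path are hypothetical, and the size is reported in GiB (1024**3 bytes), which may differ slightly from how the figures above were measured.

```python
import os


def dir_size_gb(path):
    """Return the total size of all regular files under `path`, in GiB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so linked files are not counted.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total / 1024**3


if __name__ == "__main__":
    # Hypothetical location of a locally built index.
    print(f"{dir_size_gb('indexes/my-wikipedia-dpr'):.2f} GiB")
```

A result near the 2.5G figure mentioned above would suggest the index was built without the raw collection stored.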