Make methods in IndexReaderUtils more consistent re: Analyzer
We have:
# Pass in a no-op analyzer:
analyzer = pyanalysis.get_lucene_analyzer(stemming=False, stopwords=False)
index_utils.get_term_counts(term, analyzer=analyzer)
df, cf = index_utils.get_term_counts(term)
Here, we take an analyzer.
And:
# Fetch and traverse postings for an analyzed term:
postings_list = index_utils.get_postings_list(analyzed[0], analyze=False)
for posting in postings_list:
    print(f'docid={posting.docid}, tf={posting.tf}, pos={posting.positions}')
Here, we take a bool. Let's make both consistent?
How about both take analyzer and accept None? Passing in a "no-op" analyzer seems a bit janky.
Thoughts? @PepijnBoers @Chriskamphuis
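To make the proposal concrete, here is a minimal sketch of what a consistent signature could look like. It does not use the real index_utils code: the toy analyzer, the in-memory TERM_COUNTS and POSTINGS tables, and the _resolve helper are all hypothetical stand-ins, and only the function names get_term_counts and get_postings_list come from the snippets above. The idea is that both functions accept analyzer, and analyzer=None means "the term is already analyzed, look it up as-is".

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for an analyzer: any callable mapping text to tokens.
Analyzer = Callable[[str], List[str]]


def toy_analyzer(text: str) -> List[str]:
    """Toy analyzer: lowercases and strips a trailing 's' (not a real stemmer)."""
    return [tok[:-1] if tok.endswith('s') else tok for tok in text.lower().split()]


# Toy in-memory "index", keyed by analyzed terms (assumption for illustration).
TERM_COUNTS: Dict[str, Tuple[int, int]] = {'cat': (2, 3)}            # term -> (df, cf)
POSTINGS: Dict[str, List[Tuple[int, int]]] = {'cat': [(0, 2), (4, 1)]}  # term -> [(docid, tf)]


def _resolve(term: str, analyzer: Optional[Analyzer]) -> str:
    """Return the term unchanged when analyzer is None, else its single analyzed form."""
    if analyzer is None:
        return term
    tokens = analyzer(term)
    if len(tokens) != 1:
        raise ValueError(f'expected exactly one token, got {tokens}')
    return tokens[0]


def get_term_counts(term: str, analyzer: Optional[Analyzer] = toy_analyzer) -> Tuple[int, int]:
    """Look up (df, cf); pass analyzer=None for an already-analyzed term."""
    return TERM_COUNTS.get(_resolve(term, analyzer), (0, 0))


def get_postings_list(term: str, analyzer: Optional[Analyzer] = toy_analyzer) -> List[Tuple[int, int]]:
    """Look up postings; takes the same analyzer argument as get_term_counts."""
    return POSTINGS.get(_resolve(term, analyzer), [])


df, cf = get_term_counts('Cats')                   # analyzed with the default
postings = get_postings_list('cat', analyzer=None)  # already analyzed, no bool flag needed
```

With this shape, neither call site needs a "no-op" analyzer or a separate analyze=False flag; None covers both.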
Issue Analytics
- State:
- Created 3 years ago
- Comments: 10 (3 by maintainers)
Top GitHub Comments
you can just do:
where you import that function from pyanalysis
Looks good, but then we have to specify default somewhere, otherwise we face a NameError. The question would then also be how/where to define default, right?
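On the NameError point: Python evaluates default argument values when the def statement runs, so whatever name is used as the default has to exist by then. A small sketch of that behaviour follows. The names default_analyzer and analyze_term are hypothetical and only illustrate one possible place to define the default (at module level, above the functions that use it); this is not how the library does it.

```python
from typing import Callable, List, Optional

Analyzer = Callable[[str], List[str]]


def default_analyzer(text: str) -> List[str]:
    """Hypothetical default; a real one would come from the analysis module."""
    return text.lower().split()


# The default is defined above, so this def statement succeeds.
def analyze_term(term: str, analyzer: Optional[Analyzer] = default_analyzer) -> List[str]:
    """Return the term untouched when analyzer is None, else its analyzed tokens."""
    return [term] if analyzer is None else analyzer(term)


def defining_with_missing_default_fails() -> bool:
    """Show that an undefined default name raises NameError at definition time."""
    source = 'def broken(term, analyzer=missing_default):\n    return term\n'
    try:
        exec(source, {})  # fresh namespace: missing_default does not exist
    except NameError:
        return True
    return False
```

An alternative would be to default to a sentinel and resolve the actual analyzer inside the function body, which avoids constructing an analyzer at import time.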