
Add common metrics for information retrieval

See original GitHub issue

There’s a nice summary of common metrics for IR here: https://www.pinecone.io/learn/offline-evaluation/

Although many of these are already available as part of trec_eval, it could make sense to expose separate metrics such as MRR and NDCG@K to give them more visibility.
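For illustration, here is a minimal sketch of what standalone MRR and NDCG@K implementations could look like, assuming binary relevance flags listed in the retriever's ranking order; the function names and example data are hypothetical and not part of any existing huggingface/evaluate module:

```python
import numpy as np

def mrr(ranked_relevance):
    """Mean Reciprocal Rank over a batch of queries.
    ranked_relevance: one list of 0/1 flags per query, in ranking order."""
    reciprocal_ranks = []
    for rels in ranked_relevance:
        rr = 0.0
        for rank, rel in enumerate(rels, start=1):
            if rel:
                rr = 1.0 / rank
                break
        reciprocal_ranks.append(rr)
    return float(np.mean(reciprocal_ranks))

def ndcg_at_k(ranked_relevance, k):
    """NDCG@k with binary gains and log2 rank discounting."""
    scores = []
    for rels in ranked_relevance:
        rels = np.asarray(rels, dtype=float)
        gains = rels[:k]
        dcg = float((gains / np.log2(np.arange(2, gains.size + 2))).sum())
        ideal = np.sort(rels)[::-1][:k]
        idcg = float((ideal / np.log2(np.arange(2, ideal.size + 2))).sum())
        scores.append(dcg / idcg if idcg > 0 else 0.0)
    return float(np.mean(scores))

# Two toy queries: the first relevant hit is at rank 2 and rank 1 respectively.
print(mrr([[0, 1, 0], [1, 0, 0]]))          # 0.75
print(ndcg_at_k([[0, 1, 0], [1, 0, 0]], 3)) # ~0.815
```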

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

3 reactions
ola13 commented on Sep 22, 2022

ICT - Inverse Cloze Task, e.g. as defined here: https://arxiv.org/pdf/1906.00300.pdf

Yes, I think what would be easier for my use case would be things like Recall@n / Precision@n, etc. I can take a stab at those then 😃
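For reference, Precision@n and Recall@n under the usual set-based definitions could look like the sketch below; the document ids in the example are made up:

```python
def precision_at_n(retrieved_ids, relevant_ids, n):
    """Fraction of the top-n retrieved documents that are relevant."""
    hits = sum(1 for doc_id in retrieved_ids[:n] if doc_id in relevant_ids)
    return hits / n

def recall_at_n(retrieved_ids, relevant_ids, n):
    """Fraction of all relevant documents found in the top-n results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in retrieved_ids[:n] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

# Hypothetical ranking for one query.
retrieved = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d5"}
print(precision_at_n(retrieved, relevant, n=5))  # 2/5 = 0.4
print(recall_at_n(retrieved, relevant, n=5))     # 2/3 ≈ 0.67
```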

2 reactions
ola13 commented on Sep 22, 2022

Hey @cakiki, are you still interested in working on this?

I’m thinking about IR metrics in the context of building the retriever for ROOTS. I want a simple way to test the quality of the retrievers I’m building, and I’m currently considering the following approach:

  • Choose a sample of documents from ROOTS - similar to https://huggingface.co/datasets/bigscience-data/roots_1e-1, but with two differences: I would be choosing a sample of passages as I define them in my index (I split long documents into multiple passages), and I want the sample to be smaller.
  • Split each passage from the sample into sentences (a potential challenge in some languages).
  • Create a test set using an ICT-style approach - select a random sentence from a passage and use it as a query to retrieve that passage (in my simple test-set approach I don’t necessarily want to remove the sentence from the passage).

Then I could evaluate them using the huggingface/evaluate IR metrics.
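As an illustration of the ICT-style test-set construction described above, here is a rough sketch; it uses a naive period-based sentence splitter and hypothetical passage ids, and a real pipeline would need language-aware splitting for multilingual ROOTS data:

```python
import random

def build_ict_test_set(passages, seed=0, min_sentences=2):
    """Build ICT-style (query, positive passage) pairs: for each passage,
    pick one sentence at random and use it as the query; the full passage
    (with the sentence kept in place) is the target to retrieve."""
    rng = random.Random(seed)
    test_set = []
    for passage_id, text in passages.items():
        # Naive sentence split on periods; fine for a sketch only.
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        if len(sentences) < min_sentences:
            continue
        test_set.append({
            "query": rng.choice(sentences),
            "positive_passage_id": passage_id,
        })
    return test_set

# Two toy passages keyed by passage id.
passages = {
    "p1": "The cat sat on the mat. It purred loudly. Then it fell asleep.",
    "p2": "ROOTS is a multilingual corpus. It was built for training BLOOM.",
}
for pair in build_ict_test_set(passages):
    print(pair)
```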

What do you think? I’m also looking for better ideas for creating test sets for ROOTS retrieval 😃


Top Results From Across the Web

Evaluation Metrics For Information Retrieval - Amit Chaudhary
Learn about common metrics used to evaluate the performance of information retrieval systems.

Evaluation Measures in Information Retrieval - Pinecone
How to measure retrieval performance with offline metrics like recall@K, MRR, MAP@K, and NDCG@K.

Evaluation measures (information retrieval) - Wikipedia
Evaluation measures for an information retrieval (IR) system assess how well an index ... Offline metrics are generally created from relevance ...

Evaluation measures (information retrieval) - Wikiwand
Offline metrics are generally created from relevance judgment sessions where the judges score the quality of the search results. Both binary (relevant/non- ...

Evaluation in information retrieval - Stanford NLP Group
R-precision adjusts for the size of the set of relevant documents: A perfect system could score 1 on this metric for each ...
