Add common metrics for information retrieval
There's a nice summary of common metrics for IR here: https://www.pinecone.io/learn/offline-evaluation/
Although we have many of these as part of trec_eval, it could make sense to have separate metrics like MRR and NDCG@K to give them more visibility.
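For context, here is a minimal sketch of how MRR and NDCG@K could be computed from ranked relevance judgments. The function names and the list-of-lists input format are purely illustrative, not the evaluate API:

```python
import math
from typing import Sequence


def mrr(ranked_relevance: Sequence[Sequence[int]]) -> float:
    """Mean Reciprocal Rank over queries.

    Each inner sequence holds binary relevance labels (1 = relevant) for the
    documents a retriever returned, in ranked order.
    """
    total = 0.0
    for labels in ranked_relevance:
        for rank, rel in enumerate(labels, start=1):
            if rel:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)


def ndcg_at_k(ranked_gains: Sequence[Sequence[float]], k: int) -> float:
    """NDCG@K over queries, with (possibly graded) relevance gains in ranked order."""
    scores = []
    for gains in ranked_gains:
        dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))
        ideal = sorted(gains, reverse=True)
        idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal[:k]))
        scores.append(dcg / idcg if idcg > 0 else 0.0)
    return sum(scores) / len(scores)
```

For example, `mrr([[0, 1, 0], [1, 0, 0]])` returns 0.75 (reciprocal ranks 0.5 and 1.0 averaged).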
Top Results From Across the Web

Evaluation Metrics For Information Retrieval - Amit Chaudhary
Learn about common metrics used to evaluate the performance of information retrieval systems.

Evaluation Measures in Information Retrieval - Pinecone
How to measure retrieval performance with offline metrics like recall@K, MRR, MAP@K, and NDCG@K.

Evaluation measures (information retrieval) - Wikipedia
Evaluation measures for an information retrieval (IR) system assess how well an index … Offline metrics are generally created from relevance judgment sessions where judges score the quality of the search results, using both binary (relevant/non-relevant) and graded judgments.

Evaluation in information retrieval - Stanford NLP Group
R-precision adjusts for the size of the set of relevant documents: a perfect system could score 1 on this metric for each query.
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
ICT - Inverse Cloze Task, e.g. as defined here: https://arxiv.org/pdf/1906.00300.pdf
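Roughly, ICT builds pseudo (query, context) training pairs by treating one sentence of a passage as the query and the remaining sentences as the context it should retrieve. A minimal sketch with a hypothetical helper name (the paper also leaves the query sentence in the context for a fraction of examples, which is skipped here):

```python
import random


def make_ict_example(sentences: list[str], rng: random.Random) -> tuple[str, str]:
    """Build one Inverse Cloze Task pair from a passage split into sentences.

    One sentence is drawn as the pseudo-query; the remaining sentences form
    the pseudo-evidence context the retriever should match it to.
    """
    idx = rng.randrange(len(sentences))
    query = sentences[idx]
    context = " ".join(sentences[:idx] + sentences[idx + 1:])
    return query, context
```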
Yes, I think what would be easier for my use case would be things like Recall / Precision @ n etc. I can take a stab at those then 😃
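For concreteness, a minimal sketch of what per-query Precision@n / Recall@n could look like (illustrative only, not the eventual evaluate module; the denominator convention for precision when fewer than n documents are returned varies between implementations):

```python
def precision_at_n(retrieved_ids: list, relevant_ids: set, n: int) -> float:
    """Fraction of the top-n retrieved documents that are relevant."""
    top_n = retrieved_ids[:n]
    return sum(1 for doc_id in top_n if doc_id in relevant_ids) / n


def recall_at_n(retrieved_ids: list, relevant_ids: set, n: int) -> float:
    """Fraction of all relevant documents that appear in the top-n results."""
    if not relevant_ids:
        return 0.0
    top_n = retrieved_ids[:n]
    return sum(1 for doc_id in top_n if doc_id in relevant_ids) / len(relevant_ids)
```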
Hey @cakiki are you still interested in working on this?
I'm thinking about IR metrics in the context of building the retriever for ROOTS. I want a simple way to test the quality of the retrievers I'm building, and I'm thinking about the following approach right now:
Then I could evaluate using huggingface/evaluate IR metrics. What do you think? I'm also looking for better ideas for creating test sets for ROOTS retrieval 😃