Tiny difference in MRR@10 for MS MARCO passage using pyserini.search
hey @qguo96 - here’s what I’m getting:
$ python -m pyserini.search --topics msmarco_passage_dev_subset --index msmarco-passage --output runs/run.msmarco-passage.2.txt --msmarco --bm25
$ python tools/scripts/msmarco/msmarco_eval.py collections/msmarco-passage/qrels.dev.small.tsv runs/run.msmarco-passage.2.txt
#####################
MRR @10: 0.18741227770955543
QueriesRanked: 6980
#####################
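For anyone following along, the eval script is essentially averaging, over the 6,980 dev queries, the reciprocal rank of the first relevant passage within the top 10. A rough sketch of that computation, assuming the same qrels and run-file paths as in the commands above and leaving out the real script’s validation and bookkeeping:

from collections import defaultdict

# Simplified sketch of what tools/scripts/msmarco/msmarco_eval.py computes.
qrels = defaultdict(set)
with open('collections/msmarco-passage/qrels.dev.small.tsv') as f:
    for line in f:
        qid, _, pid, rel = line.split()
        if int(rel) > 0:
            qrels[qid].add(pid)

run = defaultdict(list)
with open('runs/run.msmarco-passage.2.txt') as f:
    for line in f:
        qid, pid, rank = line.split()        # MS MARCO run format: qid <tab> pid <tab> rank
        run[qid].append((int(rank), pid))

total = 0.0
for qid, entries in run.items():
    for rank, pid in sorted(entries)[:10]:   # only the top 10 contribute to MRR@10
        if pid in qrels[qid]:
            total += 1.0 / rank
            break

print(f'MRR @10: {total / len(qrels)}')
print(f'QueriesRanked: {len(run)}')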
However, the MRR@10 reported here (https://github.com/castorini/anserini/blob/master/docs/experiments-msmarco-passage.md) is:
#####################
MRR @10: 0.18741227770955546
QueriesRanked: 6980
#####################
Note the tiny difference in the final digit. Would you mind looking into this? Could you diff the actual output from both cases and see what’s going on? Just wanted to make sure this isn’t a bug…
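One thing worth ruling out first: floating-point addition isn’t associative, so averaging the same per-query reciprocal ranks in a different query order can already change the trailing digits of the mean. A quick illustration, with made-up reciprocal-rank values (not taken from either run):

import random

# Floating-point addition is not associative, so averaging the *same* per-query
# reciprocal ranks in a different order can shift the last digit or two of the mean.
# The values below are synthetic, purely for illustration.
random.seed(0)
rr = [1.0 / random.randint(1, 10) for _ in range(6980)]

mrr_a = sum(rr) / len(rr)

shuffled = list(rr)
random.shuffle(shuffled)
mrr_b = sum(shuffled) / len(shuffled)

print(mrr_a)
print(mrr_b)
print(abs(mrr_a - mrr_b))   # typically a few ulps, i.e. on the order of 1e-17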
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Agreed, but for replicability purposes we’d ideally want results to be exactly the same, and if they’re not, we should at least understand why.
(Sorry to interject!) Since the difference is 3e-17, isn’t the problem more about the unrounded precision of the eval script than anything else? Differences due to summation order seem expected (if only, in part, with hindsight!), and they’re in fact smaller than I’d personally have guessed.
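On the “diff the actual output” suggestion above, a rough sketch for checking whether the two runs actually rank anything differently per query; the second file name is hypothetical and should point at the run produced by following the Anserini doc:

from collections import defaultdict

def load_top10(path):
    # Load an MS MARCO-format run (qid <tab> pid <tab> rank) as qid -> ordered top-10 pids.
    run = defaultdict(list)
    with open(path) as f:
        for line in f:
            qid, pid, rank = line.split()
            run[qid].append((int(rank), pid))
    return {qid: [p for _, p in sorted(entries)[:10]] for qid, entries in run.items()}

run_a = load_top10('runs/run.msmarco-passage.2.txt')
# Hypothetical path -- point this at the run generated from the Anserini doc's commands.
run_b = load_top10('runs/run.msmarco-passage.anserini.txt')

diffs = [qid for qid in run_a if run_a[qid] != run_b.get(qid)]
print(f'{len(diffs)} queries with differing top-10 lists')

If that prints 0, the rankings are identical and the 3e-17 gap can only come from summation order and the unrounded output, not from the retrieval itself.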