MS MARCO passage regression errors: BM25prf gives non-deterministic results
Hi @emmileaf, I’m getting these MS MARCO passage regression errors.
This is on tuna:
2019-08-11 03:55:02,107 - regression_test - ERROR - !!!!!{"actual": 0.1518, "collection": "msmarco-passage", "expected": 0.152, "metric": "map", "model": "bm25-default+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!
...
2019-08-11 03:56:41,141 - regression_test - ERROR - !!!!!{"actual": 0.1579, "collection": "msmarco-passage", "expected": 0.1582, "metric": "map", "model": "bm25-tuned+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!
This is on another machine:
2019-08-11 03:38:45,575 - regression_test - ERROR - !!!!!{"actual": 0.1519, "collection": "msmarco-passage", "expected": 0.152, "metric": "map", "model": "bm25-default+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!
...
2019-08-11 03:39:51,630 - regression_test - ERROR - !!!!!{"actual": 0.158, "collection": "msmarco-passage", "expected": 0.1582, "metric": "map", "model": "bm25-tuned+prf", "topic": "[MS MARCO Passage Ranking: Dev Queries](https://github.com/microsoft/MSMARCO-Passage-Ranking)"}!!!!!
It seems like BM25prf gives non-deterministic results?
@matthew-z any ideas?
Top GitHub Comments
From a quick look at the runs, this might be a score tie handling issue?
https://github.com/castorini/anserini/blob/5b29d1654abc5e8a014c2230da990ab2f91fb340/src/main/java/io/anserini/rerank/lib/BM25PrfReranker.java#L113
https://github.com/castorini/anserini/blob/5b29d1654abc5e8a014c2230da990ab2f91fb340/src/main/java/io/anserini/rerank/lib/Rm3Reranker.java#L111-L118
I’ll add in tiebreak handling similar to the other rerankers, and update regression numbers for PRF.
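For readers following along, here is a minimal sketch of what deterministic tie handling means in general. It is written in Python rather than Anserini's Java, and it is not the code from the rerankers linked above: the function name `rank_with_tiebreak`, the docids, the scores, and the rule "break ties by ascending docid" are all assumptions made for illustration. The point is only that when two documents share a score, a secondary sort key makes the final ranking independent of the order in which hits arrive.

```python
from typing import List, Tuple


def rank_with_tiebreak(scored: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sort (docid, score) pairs by score descending.

    Ties on score are broken by docid ascending. This tie rule is an
    assumption for illustration, not necessarily the rule Anserini uses.
    """
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Made-up hits: the same three documents arriving in two different orders,
# with two of them tied on score.
run_a = [("7187158", 12.5), ("7187157", 12.5), ("100", 14.0)]
run_b = [("7187157", 12.5), ("100", 14.0), ("7187158", 12.5)]

ranked_a = rank_with_tiebreak(run_a)
ranked_b = rank_with_tiebreak(run_b)

# Both runs now produce the same ranking regardless of input order.
print(ranked_a == ranked_b)
```

Without the secondary key, the relative order of the two tied documents would depend on input order, which can shift a metric such as MAP in the fourth decimal place.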
Two trials on tuna (Java 8) give the same result. Two trials on my iMac Pro (Java 8) give the same result. I think it’s just the case that we forgot to update the regression values.
See PR #788