
MS MARCO document / bi-encoder evaluation

See original GitHub issue

Hi! Thanks for your work on this repo! I am trying to reproduce the bi-/cross-encoder results on the MS MARCO dataset, but a few things confuse me:

  1. I saw that you mention in the description that MRR@10 on the MS MARCO passage retrieval task is around 30-36, but when I train distilroberta-base I can easily reach 37-39, even with 1,000 negatives per query. So where does the difference come from? Or should I use eval_msmarco.py from the examples?
  2. Have you experimented with bi-/cross-encoders on document-level tasks? If so, can you share the results so I can verify that my changes are correct? Also, do you have a recommended approach for document-level tasks?
  3. When I train the bi-encoder, the MRR in the provided example easily reaches 42+ or even 70+ (passage retrieval), but when I switch to document retrieval it quickly reaches 20+ and then oscillates around 20-22. Are both of these results expected? And if I want to evaluate the bi-encoder's performance separately, what should I do?

All the code I use comes from sentence-transformers/examples/training/ms_marco.
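For reference, the bi-encoder training in those examples boils down to something like the following minimal sketch. The triplet data here is a placeholder, and the real train_bi-encoder_*.py scripts add MS MARCO data loading and hard-negative mining:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Base model from the question above; the examples use several base models.
model = SentenceTransformer("distilroberta-base")

# (query, positive passage, negative passage) triplets -- placeholders here;
# in the real scripts these are built from the MS MARCO training files.
train_examples = [
    InputExample(texts=[
        "what is a bi-encoder",
        "A bi-encoder maps queries and passages to vectors independently.",
        "Paris is the capital of France.",
    ]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)

# MultipleNegativesRankingLoss also treats other in-batch passages as negatives.
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=1000,
)
```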

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
nreimers commented, May 14, 2021

Hi,

  1. Yes, you should use eval_msmarco.py. During training, scores are computed on a small subset with a much smaller corpus, so that evaluation runs quickly. For a comparable number, you must run the evaluation on the full collection of roughly 8.8 million passages (see the evaluation sketch after this list).

  2. I have not done document-level experiments yet. What works well is to split your documents into passages, either by identifying paragraph boundaries (e.g., one or two blank lines) or by cutting the document into roughly 100-word chunks. Then encode the chunks individually and do standard passage retrieval. Afterwards, map each retrieved passage back to its source document, i.e., determine which document the passage came from and show that document to the user (see the chunking sketch after this list).

  3. I have no experience with document retrieval. Regarding the MRR of 70+: yes, this is normal, since a tiny corpus is used during training so that evaluation is quicker.
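To make point 1 concrete, here is a minimal sketch of a full-corpus evaluation with InformationRetrievalEvaluator from sentence-transformers. The tiny inline corpus and the model path are placeholders; the actual eval_msmarco.py script loads the official collection and dev query files:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Placeholder data: in practice eval_msmarco.py reads the official
# collection.tsv / queries.dev.small.tsv / qrels.dev.tsv files.
corpus = {"pid1": "passage text ...", "pid2": "another passage ..."}  # pid -> passage
queries = {"qid1": "example dev query"}                               # qid -> query
relevant_docs = {"qid1": {"pid1"}}                                    # qid -> relevant pids

evaluator = InformationRetrievalEvaluator(
    queries, corpus, relevant_docs,
    mrr_at_k=[10],          # MRR@10 is the official MS MARCO dev metric
    show_progress_bar=True,
)

model = SentenceTransformer("output/my-bi-encoder")  # placeholder path
# Scores are only comparable to published numbers when the corpus is the
# full ~8.8M-passage collection, not the small training-time subset.
evaluator(model)
```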
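And a minimal sketch of the chunk-and-map-back approach from point 2. The 100-word window, the model name, and the toy corpus are assumptions for illustration:

```python
from sentence_transformers import SentenceTransformer, util

def chunk_document(doc_id, text, words_per_chunk=100):
    """Split a document into ~100-word passages, remembering the source doc."""
    words = text.split()
    for start in range(0, len(words), words_per_chunk):
        yield doc_id, " ".join(words[start:start + words_per_chunk])

documents = {"doc1": "long document text ..."}  # placeholder corpus

passage_to_doc, passages = [], []
for doc_id, text in documents.items():
    for owner, passage in chunk_document(doc_id, text):
        passage_to_doc.append(owner)
        passages.append(passage)

# Any MS MARCO-trained bi-encoder works here; this name is one example.
model = SentenceTransformer("msmarco-distilbert-base-v3")
passage_emb = model.encode(passages, convert_to_tensor=True)
query_emb = model.encode("example query", convert_to_tensor=True)

# Standard passage retrieval, then map each hit back to its document.
hits = util.semantic_search(query_emb, passage_emb, top_k=10)[0]
for hit in hits:
    print(passage_to_doc[hit["corpus_id"]], hit["score"])
```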

Read more comments on GitHub >

Top Results From Across the Web

MS MARCO — Sentence-Transformers documentation
MS MARCO Passage Ranking is a large dataset to train models for information retrieval. It consists of about 500k real search queries from...

MS MARCO - Microsoft Open Source

  date        type             MRR@100 (Dev)  MRR@100 (Eval)
  2022/02/08  🏆 full ranking  0.512          0.446
  2021/07/14  🏆 full ranking  0.500          0.440
  2021/06/24  🏆 full ranking  0.496          0.436

Multilingual Information Retrieval (MS-Marco Bi-Encoders) #695
My Question / Kind Request We currently use your new Bi-Encoder ... Is the MS MARCO dataset useful for training and evaluation: Yes, ...

cross-encoder/msmarco-MiniLM-L6-en-de-v1 - Hugging Face
This is a cross-lingual Cross-Encoder model for EN-DE that can be used for passage re-ranking. It was trained on the MS Marco Passage...

Semi-Siamese Bi-encoder Neural Ranking Model Using ...
Between the two, bi-encoder is highly efficient because all the documents can be ... metrics are evaluated over Robust04, ClueWeb09b, and MS-MARCO datasets...
