Insidious fastutil, FeatureVector, and RM3 bug: massive regression impact!

I was trying to upgrade fastutil from version 6.5.6 (an ancient version from Jun 14, 2013) to the latest, version 8.3.0, when I came across a really insidious multi-part bug. The tl;dr is that there’s a bug in RM3, which will affect all regressions. Here’s the full story:

The class FeatureVector is built around the fastutil Object2FloatOpenHashMap class, which is used by the RM3 implementation to estimate relevance models. In the current implementation, when estimating the relevance model for the feedback docs, we truncate each individual feedback document:

docVector.pruneToSize(fbTerms);

This is the first part of the bug. Just because we ultimately want to select fbTerms terms for feedback doesn’t mean that we should only consider fbTerms terms from each document. This was probably done for performance reasons, although query latency really isn’t affected. I checked: on my iMac Pro, query latency doesn’t increase with that line removed.

Now this leads to the second part of the bug: the method pruneToSize sorts the features by weight, but it doesn’t break ties consistently. Tie breaking is therefore implementation specific: the fastutil upgrade changed the tie-breaking behavior, so different terms are selected from each document, which changes the results.
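
To make the tie-breaking problem concrete, here is a minimal sketch in plain Java. It does not use fastutil, and the `pruneToSize` below is a hypothetical stand-in written for illustration, not the actual FeatureVector method. It sorts by weight in descending order and breaks ties on the term itself, so the selected terms no longer depend on the iteration order of the underlying map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TieBreakSketch {
  // Hypothetical stand-in for pruneToSize: keep the k highest-weighted terms.
  // Ties on weight are broken by term (lexicographic order), so the result
  // is the same regardless of how the input map happens to iterate.
  static Map<String, Float> pruneToSize(Map<String, Float> features, int k) {
    List<Map.Entry<String, Float>> entries = new ArrayList<>(features.entrySet());
    entries.sort((a, b) -> {
      int byWeight = Float.compare(b.getValue(), a.getValue()); // descending weight
      return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
    });
    Map<String, Float> pruned = new LinkedHashMap<>();
    for (Map.Entry<String, Float> e : entries.subList(0, Math.min(k, entries.size()))) {
      pruned.put(e.getKey(), e.getValue());
    }
    return pruned;
  }
}
```

Without the secondary comparison on the term, which of several equally weighted terms survives would be decided by whatever order the hash map yields, and that is exactly what can change across library versions.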

Insert facepalm here.

So to fix this, we need to:

  1. Stop pruning terms from individual feedback docs.
  2. Implement consistent tie-breaking behavior in the FeatureVector implementation, to prevent future issues along these lines.
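
The two steps above can be sketched roughly as follows. This is a simplified illustration, not Anserini's RM3 code: the class and method names are hypothetical, and the plain sum of term weights stands in for the real relevance model estimation. Weights are aggregated over the full feedback documents first, and the top fbTerms terms are selected only once at the end, with deterministic tie-breaking.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FeedbackSelectionSketch {
  // Sum term weights across all feedback documents, without truncating
  // any individual document first. (Simplified: real RM3 also weights
  // documents by retrieval score.)
  static Map<String, Float> aggregate(List<Map<String, Float>> docVectors) {
    Map<String, Float> total = new HashMap<>();
    for (Map<String, Float> doc : docVectors) {
      for (Map.Entry<String, Float> e : doc.entrySet()) {
        total.merge(e.getKey(), e.getValue(), Float::sum);
      }
    }
    return total;
  }

  // Select the fbTerms highest-weighted terms; ties are broken by term
  // so the selection does not depend on map iteration order.
  static List<String> topTerms(Map<String, Float> weights, int fbTerms) {
    List<Map.Entry<String, Float>> entries = new ArrayList<>(weights.entrySet());
    entries.sort((a, b) -> {
      int byWeight = Float.compare(b.getValue(), a.getValue());
      return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
    });
    List<String> terms = new ArrayList<>();
    for (Map.Entry<String, Float> e : entries.subList(0, Math.min(fbTerms, entries.size()))) {
      terms.add(e.getKey());
    }
    return terms;
  }
}
```

As a toy case, take two documents {a: 0.5, c: 0.4} and {b: 0.5, c: 0.4} with fbTerms = 1. Pruning each document to one term first would discard c from both, even though c has the largest aggregate weight; aggregating first and pruning once keeps it.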

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 10 (4 by maintainers)

Top GitHub Comments

2 reactions
lintool commented, Dec 10, 2019

Okay, here are the results, on Robust04:

AP                          Paper 1   Paper 2
BM25+RM3 (default)          0.2903    0.2903
BM25+RM3 (default): fixed   0.2920    0.2920
BM25+RM3 (tuned)            0.3043    0.3021
BM25+RM3 (tuned): fixed     0.3004    0.2989

Note that the tuned “fixed” results use the old parameter settings, without retuning.

cf: https://github.com/castorini/anserini/blob/master/docs/experiments-forum2018.md

For the record, these are the commands:

python src/main/python/fine_tuning/reconstruct_robus04_tuned_run.py \
 --index lucene-index.robust04.pos+docvectors+rawdocs \
 --folds src/main/resources/fine_tuning/robust04-paper1-folds.json \
 --params src/main/resources/fine_tuning/params/params.map.robust04-paper1-folds.bm25+rm3.json \
 --output run.robust04.bm25+rm3.paper1.txt


python src/main/python/fine_tuning/reconstruct_robus04_tuned_run.py \
 --index lucene-index.robust04.pos+docvectors+rawdocs \
 --folds src/main/resources/fine_tuning/robust04-paper2-folds.json \
 --params src/main/resources/fine_tuning/params/params.map.robust04-paper2-folds.bm25+rm3.json \
 --output run.robust04.bm25+rm3.paper2.txt


eval/trec_eval.9.0.4/trec_eval src/main/resources/topics-and-qrels/qrels.robust04.txt run.robust04.bm25+rm3.paper1.txt

eval/trec_eval.9.0.4/trec_eval src/main/resources/topics-and-qrels/qrels.robust04.txt run.robust04.bm25+rm3.paper2.txt

2 reactions
daltonj commented, Dec 10, 2019

It doesn’t seem right to not fix a bug because it would change numbers. Isn’t this the correct, desired outcome of a bug fix? Fix the bug, update the tests…? It doesn’t seem right to use / cite an RM3 implementation that is incorrect…?
