[Feature Request] custom fusion method in optimize_fusion
Is your feature request related to a problem? Please describe.
Hi, you’ve done a great job implementing plenty of different fusion algorithms, but I think a fixed set of built-in methods will always be a bottleneck. What would you think about letting the user define their own training function?
Describe the solution you’d like
For example, in optimize_fusion, allow method to be a callable and, in this case, do not call
Describe alternatives you’ve considered
- Open a feature request every time I want to try out something new 😃
- Fork ranx and implement new fusion methods there
My use case/ Ma et al.
By the way, at the moment, my use case is the default-minimum trick of Ma et al.: when combining results from systems A and B, a document retrieved only by system B is assigned the minimum score among A’s results, and vice versa.
Maybe this is already possible in ranx via some option/method named differently? Or maybe you’d like to add it in the core ranx fusion algorithms?
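To make the default-minimum idea concrete, here is a minimal sketch in plain Python. It does not use the ranx API: the function name default_minimum_sum, the per-query dict layout (document id mapped to score), and the choice of summing the two scores after backfilling are all assumptions made for illustration. It also assumes both result lists are non-empty.

```python
from typing import Dict

Scores = Dict[str, float]  # doc_id -> score, for a single query


def default_minimum_sum(scores_a: Scores, scores_b: Scores) -> Scores:
    """Hypothetical helper: sum two systems' scores for one query.

    A document missing from one system is given that system's minimum
    score (the default-minimum trick) instead of being dropped or
    counted as zero. Both inputs are assumed to be non-empty.
    """
    min_a = min(scores_a.values())
    min_b = min(scores_b.values())
    fused: Scores = {}
    # Iterate over the union of documents retrieved by either system.
    for doc_id in scores_a.keys() | scores_b.keys():
        fused[doc_id] = scores_a.get(doc_id, min_a) + scores_b.get(doc_id, min_b)
    return fused


# Toy scores: d1 is only in A, d3 is only in B.
system_a = {"d1": 0.9, "d2": 0.4}
system_b = {"d2": 0.7, "d3": 0.2}
fused = default_minimum_sum(system_a, system_b)
```

In this toy example, d1 is backfilled with B's minimum (0.2) and d3 with A's minimum (0.4), so d3 ends up ranked below the two documents that A scored highly. A function of this shape is the kind of user-defined callable the request has in mind.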
- Created 2 months ago
- Comments:14 (6 by maintainers)
Top GitHub Comments
I never used ZMUV, to be honest. I implemented it for completeness and tried it for comparison purposes, but never got better results than sum, which sometimes works the best.
In general, I prefer local normalization schemes because they are “unsupervised” and can be used out of the box.
Without strong empirical evidence that default-minimum (w/ or w/o ZMUV) works better than sum, I would not use it.
Also, without a standardized way of normalizing/fusing results, it is often difficult to understand what brings improvements over the state of the art. Conducting in-depth ablation studies is costly, and we often lack the space in conference papers to write about them.
Thank you very much, Paul!
I am happy to see that
To give you some context, I added/invented max norm because the minimum score is often unknown.
We usually fuse only the top retrieved documents from each model, which makes min-max (in this specific context) not very sound to me.
I did not do extensive experimentation, but from my experience max norm outperforms min-max very often.
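The difference between the two normalizations on a truncated top-k list can be sketched as follows. This is plain Python, not ranx code: the function names are hypothetical, and max norm is assumed here to mean dividing each score by the per-query maximum, with non-negative scores.

```python
from typing import Dict

Scores = Dict[str, float]  # doc_id -> score, for a single query


def min_max_norm(scores: Scores) -> Scores:
    """Rescale to [0, 1] using the min and max of the *retrieved* scores."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Degenerate case: all scores equal, so map everything to 1.0.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def max_norm(scores: Scores) -> Scores:
    """Divide by the maximum score only (assumes non-negative scores)."""
    hi = max(scores.values())
    if hi == 0:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: s / hi for doc_id, s in scores.items()}


# Top-3 results of some model; the true minimum over the collection is unknown.
top_k = {"d1": 12.0, "d2": 9.0, "d3": 6.0}
mm = min_max_norm(top_k)   # {"d1": 1.0, "d2": 0.5, "d3": 0.0}
mx = max_norm(top_k)       # {"d1": 1.0, "d2": 0.75, "d3": 0.5}
```

With min-max, the last document of the top-k is pushed to 0 even though it was good enough to be retrieved, because the cutoff score stands in for the unknown true minimum. Dividing by the maximum alone avoids that artifact.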