label_ranking_average_precision_score: sample_weight isn't applied to items with zero true labels
Description
label_ranking_average_precision_score offers a sample_weight argument to allow nonuniform contribution of individual samples to the reported metric. Separately, individual samples whose labels are the same for all classes (all true or all false) are treated as a special case (precision == 1, line 732). However, this special case bypasses the application of sample_weight (line 740). So, when there is both a non-default sample_weight and at least one sample with, for instance, zero true labels, the reported metric is wrong.
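To make the control-flow problem concrete, here is a simplified sketch of the metric's per-sample loop (a paraphrase for illustration, not the actual scikit-learn source): the all-equal-labels branch adds 1.0 to the running total without ever touching sample_weight.

```python
import numpy as np

def lraps_sketch(y_true, y_score, sample_weight=None):
    """Simplified sketch of label_ranking_average_precision_score
    illustrating the bug; the ranking math is paraphrased and this is
    NOT the actual scikit-learn implementation."""
    n_samples, n_labels = y_true.shape
    out = 0.0
    for i in range(n_samples):
        relevant = np.flatnonzero(y_true[i])
        if len(relevant) == 0 or len(relevant) == n_labels:
            out += 1.0  # BUG: sample_weight[i] is never applied here
            continue
        aux = 0.0
        for j in relevant:
            # rank of score j among all scores, and among relevant scores
            rank = np.sum(y_score[i] >= y_score[i, j])
            rel_rank = np.sum(y_score[i, relevant] >= y_score[i, j])
            aux += rel_rank / rank
        score = aux / len(relevant)
        if sample_weight is not None:
            score *= sample_weight[i]
        out += score
    if sample_weight is not None:
        return out / np.sum(sample_weight)
    return out / n_samples
```

On the reproduction data below, this sketch returns the same wrong 1.125: the third (zero-label, zero-weight) sample contributes an unweighted 1.0 to the numerator while contributing 0.0 to the denominator.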
Steps/Code to Reproduce
See example in this colab
import numpy as np
import sklearn.metrics
# Per sample APs are 0.5, 0.75, and 1.0 (default for zero labels).
truth = np.array([[1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 0, 0]], dtype=bool)
scores = np.array([[0.3, 0.4, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]])
print(sklearn.metrics.label_ranking_average_precision_score(
    truth, scores, sample_weight=[1.0, 1.0, 0.0]))
Expected Results
Weighted average of the APs of the first and second samples: (0.5 + 0.75) / 2 = 0.625
Actual Results
Sum of the APs of all three samples, divided by the sum of the weight vector: (0.5 + 0.75 + 1.0) / (1.0 + 1.0 + 0.0) = 2.25 / 2 = 1.125
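The discrepancy can be verified by hand with the per-sample APs assumed in the example above: the zero-weight third sample should drop out of a correctly weighted average, but the buggy computation adds its unweighted 1.0 to the numerator anyway.

```python
import numpy as np

# Per-sample APs from the example above (1.0 is the special-case value
# assigned to the zero-label third sample).
per_sample_ap = np.array([0.5, 0.75, 1.0])
weights = np.array([1.0, 1.0, 0.0])

# Correct weighted average: the zero-weight sample contributes nothing.
expected = np.sum(per_sample_ap * weights) / np.sum(weights)  # 0.625

# Buggy computation: the special-case 1.0 enters the sum unweighted.
actual = (0.5 * 1.0 + 0.75 * 1.0 + 1.0) / np.sum(weights)  # 1.125
```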
Versions
System: python: 3.6.7 (default, Oct 22 2018, 11:32:17) [GCC 8.2.0] executable: /usr/bin/python3 machine: Linux-4.14.79+-x86_64-with-Ubuntu-18.04-bionic
BLAS: macros: SCIPY_MKL_H=None, HAVE_CBLAS=None lib_dirs: /usr/local/lib cblas_libs: mkl_rt, pthread
Python deps: pip: 19.0.3 setuptools: 40.8.0 sklearn: 0.20.3 numpy: 1.14.6 scipy: 1.1.0 Cython: 0.29.6 pandas: 0.22.0
Issue Analytics
- Created 5 years ago
- Comments: 7 (4 by maintainers)
Top GitHub Comments
I don’t think this is quite the fix we want. The edge case is not that the sample weight is zero (I just used that in the example to make the impact easy to see). The problem is to account for any kind of non-default sample weight in the case of constant labels for all classes.
I’ll work on a solution and some tests.
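One possible shape of the fix (a hypothetical sketch under the same simplified structure as the paraphrase above, not the merged patch): apply the sample weight in the special-case branch as well, so both code paths contribute consistently to the weighted sum.

```python
import numpy as np

def lraps_fixed_sketch(y_true, y_score, sample_weight=None):
    """Hypothetical fix sketch: the all-equal-labels special case now
    receives the same per-sample weight as every other sample.
    NOT the actual scikit-learn patch."""
    n_samples, n_labels = y_true.shape
    if sample_weight is None:
        sample_weight = np.ones(n_samples)
    sample_weight = np.asarray(sample_weight, dtype=float)
    out = 0.0
    for i in range(n_samples):
        relevant = np.flatnonzero(y_true[i])
        if len(relevant) == 0 or len(relevant) == n_labels:
            out += 1.0 * sample_weight[i]  # weight applied here too
            continue
        aux = 0.0
        for j in relevant:
            rank = np.sum(y_score[i] >= y_score[i, j])
            rel_rank = np.sum(y_score[i, relevant] >= y_score[i, j])
            aux += rel_rank / rank
        out += (aux / len(relevant)) * sample_weight[i]
    return out / np.sum(sample_weight)
```

On the reproduction data this returns the expected 0.625, since the zero-weight third sample no longer leaks into the numerator.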
Bookmarking this. As a first-timer, this is super helpful. From this, I understand which file I'm supposed to be writing test cases in and the fact that they should be robust. It also really helps drive home the point about writing code in such a way that very minimal changes are needed to EXTEND it. I can see how 4 lines of code achieve this because of how the REST of the code is written overall.
I should develop the patience and willpower to work out the math formula for this listed in the documentation. Had I done that, I would've written more meaningful code. I'll spend some time and try to go over why your lines make sense.
Thanks a ton for pinging this here and giving me a notification.