label_ranking_average_precision_score: sample_weighting isn't applied to items with zero true labels

Description

label_ranking_average_precision_score offers a sample_weight argument to allow nonuniform contribution of individual samples to the reported metric. Separately, samples whose labels are the same for all classes (all true or all false) are treated as a special case (precision == 1, line 732). However, this special case bypasses the application of sample_weight (line 740). So when non-default sample weights are combined with samples that have, for instance, zero true labels, the reported metric is wrong.

Steps/Code to Reproduce

See example in this colab

import numpy as np
import sklearn.metrics

# Per sample APs are 0.5, 0.75, and 1.0 (default for zero labels).
truth = np.array([[1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 0, 0]], dtype=bool)
scores = np.array([[0.3, 0.4, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]])
print(sklearn.metrics.label_ranking_average_precision_score(
    truth, scores, sample_weight=[1.0, 1.0, 0.0]))

Expected Results

Average of AP of first and second samples = 0.625

Actual Results

Sum of AP of all three samples, divided by sum of weighting vector = 2.25/2 = 1.125
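
The two numbers above can be reproduced by hand. The following is a minimal pure-Python sketch, not scikit-learn's implementation: `per_sample_lrap` and `is_constant` are hypothetical helpers written for illustration, and the "reported" computation only mimics the behavior described in this issue (constant-label samples contributing an unweighted 1.0).

```python
def per_sample_lrap(truth_row, score_row):
    """Label ranking average precision for one sample (illustrative helper)."""
    relevant = [j for j, t in enumerate(truth_row) if t]
    # Constant labels (all true or all false): the score is defined as 1.0.
    if len(relevant) == 0 or len(relevant) == len(truth_row):
        return 1.0
    total = 0.0
    for j in relevant:
        # Rank of label j among all labels (ties take the largest rank).
        rank = sum(1 for s in score_row if s >= score_row[j])
        # Number of relevant labels ranked at or above label j.
        hits = sum(1 for k in relevant if score_row[k] >= score_row[j])
        total += hits / rank
    return total / len(relevant)


def is_constant(truth_row):
    """True when every label in the row is the same (illustrative helper)."""
    return all(truth_row) or not any(truth_row)


truth = [[1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 0, 0]]
scores = [[0.3, 0.4, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]
weights = [1.0, 1.0, 0.0]

per_sample = [per_sample_lrap(t, s) for t, s in zip(truth, scores)]

# Expected: every sample is weighted, including the constant-label one.
expected = sum(w * ap for w, ap in zip(weights, per_sample)) / sum(weights)

# Reported in the issue: constant-label samples add 1.0 with no weight applied.
reported = sum(
    ap if is_constant(t) else w * ap
    for t, w, ap in zip(truth, weights, per_sample)
) / sum(weights)

print(per_sample, expected, reported)
```

With the inputs from this issue, `per_sample` comes out as 0.5, 0.75, and 1.0, `expected` as 0.625, and `reported` as 1.125.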

Versions

System:
  python: 3.6.7 (default, Oct 22 2018, 11:32:17) [GCC 8.2.0]
  executable: /usr/bin/python3
  machine: Linux-4.14.79+-x86_64-with-Ubuntu-18.04-bionic

BLAS:
  macros: SCIPY_MKL_H=None, HAVE_CBLAS=None
  lib_dirs: /usr/local/lib
  cblas_libs: mkl_rt, pthread

Python deps:
  pip: 19.0.3
  setuptools: 40.8.0
  sklearn: 0.20.3
  numpy: 1.14.6
  scipy: 1.1.0
  Cython: 0.29.6
  pandas: 0.22.0

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

2 reactions
dpwe commented, Mar 11, 2019

I don’t think this is quite the fix we want. The edge case is not that the sample weight is zero (I just used that in the example to make the impact easy to see). The problem is to account for any kind of non-default sample weight in the case of constant labels for all classes.

I’ll work on a solution and some tests.
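
To illustrate the point that the problem is not limited to zero weights, here is a small sketch using the per-sample APs from the issue (0.5, 0.75, 1.0) with a hypothetical non-zero weight vector. The weights and variable names are assumptions chosen for illustration; the "unweighted special case" line only mimics the behavior described in the issue.

```python
# Per-sample APs from the issue; the third sample has constant (all-false) labels.
per_sample_ap = [0.5, 0.75, 1.0]
constant_labels = [False, False, True]

# Hypothetical non-default weights, none of them zero.
weights = [1.0, 2.0, 3.0]

# Weight applied to every sample: (0.5 + 1.5 + 3.0) / 6
fully_weighted = sum(
    w * ap for w, ap in zip(weights, per_sample_ap)
) / sum(weights)

# Weight skipped for the constant-label sample: (0.5 + 1.5 + 1.0) / 6
unweighted_special_case = sum(
    ap if const else w * ap
    for const, w, ap in zip(constant_labels, weights, per_sample_ap)
) / sum(weights)

print(fully_weighted, unweighted_special_case)
```

Here the two results differ (about 0.833 versus 0.5) even though no weight is zero, so a fix that only checks for zero weights would not be enough.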

0 reactions
AbhishekBabuji commented, Mar 14, 2019

Bookmarking this. As a first-timer, I find this super helpful. From this, I understand which file I’m supposed to be writing test cases in, and that the tests should be robust. It also really drives home the point about writing code in such a way that only minimal changes are needed to EXTEND it. I can see how 4 lines of code achieve this because of how the REST of the code is written.

I should develop the patience and the willpower to work through the math formula for this listed in the documentation. Had I done that, I would’ve written more meaningful code. I’ll spend some time going over why your lines make sense.

Thanks a ton for pinging this here and giving me a notification.
