Issues with negative values in sample_weight

Description

I am not sure how a negative value in sample_weight should be interpreted or why it should be supported, but I believe that in several cases sample_weight should be constrained to non-negative values; negative weights can lead to some very strange results.

See an example below for r2_score where the use of negative weights yields a value larger than one, which really does not make sense.

Steps/Code to Reproduce

import numpy as np

from sklearn.metrics import r2_score

np.random.seed(seed=2)
x = np.random.randn(100,)
y = x + 0.3*np.random.randn(*x.shape)
w = np.random.randn(*x.shape)  # standard-normal weights, so roughly half are negative

r2_score(x, y, sample_weight=w)

Expected Results

A value less than or equal to 1.0

Actual Results

1.1919195778883198
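
One way to see where a score above 1 can come from is to write the weighted R² out term by term. The sketch below is plain Python, not scikit-learn's own code: weighted_r2 is a hypothetical helper and the four-sample data set is made up for illustration. It assumes the usual definition, 1 minus the weighted residual sum of squares divided by the weighted total sum of squares around the weighted mean.

```python
def weighted_r2(y_true, y_pred, weights):
    """Return (R^2, weighted residual SS, weighted total SS)."""
    w_sum = sum(weights)
    # Weighted mean of the targets.
    y_mean = sum(w * t for w, t in zip(weights, y_true)) / w_sum
    # Each squared error is multiplied by its weight, so a negative
    # weight contributes a negative term to the sum.
    res_ss = sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred))
    tot_ss = sum(w * (t - y_mean) ** 2 for w, t in zip(weights, y_true))
    return 1.0 - res_ss / tot_ss, res_ss, tot_ss


# Made-up data: the only prediction error sits on the last sample.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]

r2_unit, res_unit, tot_unit = weighted_r2(y_true, y_pred, [1.0, 1.0, 1.0, 1.0])
r2_neg, res_neg, tot_neg = weighted_r2(y_true, y_pred, [1.0, 1.0, 1.0, -0.25])

print("unit weights:    ", r2_unit, res_unit, tot_unit)
print("negative weight: ", r2_neg, res_neg, tot_neg)
```

With unit weights both sums are non-negative, so the score cannot exceed 1. Once the mispredicted sample carries a negative weight, the residual sum becomes negative while the total sum stays positive, and the score rises above 1.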

Versions

System:
    python: 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)]
    executable: C:\Users\nak142\Miniconda3\envs\sklearn_contrib\pythonw.exe
    machine: Windows-10-10.0.17134-SP0

BLAS:
    macros: SCIPY_MKL_H=None, HAVE_CBLAS=None
    lib_dirs: C:/Users/nak142/Miniconda3/envs/sklearn_contrib\Library\lib
    cblas_libs: mkl_rt

Python deps:
    pip: 10.0.1
    setuptools: 40.0.0
    sklearn: 0.21.dev0
    numpy: 1.15.0
    scipy: 1.1.0
    Cython: 0.28.5
    pandas: None

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
alexpearce commented, Oct 29, 2018

Thanks for the ping @amueller.

I still think the feature is very useful. Samples with negative weights play a critical role in the area of high energy physics that I work in. But one does need to understand that one can easily arrive at nonsensical results when using them, so having a warning/error is probably a good sanity check for many use-cases.

I think a global flag is a nice idea, as then the user has to admit “I think I know what I’m doing” and has to interpret any results accordingly.
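
A minimal sketch of what such an opt-in check could look like is given below. Every name in it (the _ALLOW_NEGATIVE_WEIGHTS switch, set_allow_negative_weights, check_sample_weight) is hypothetical and chosen only for illustration; none of them is an existing scikit-learn API. It assumes the behaviour discussed here: reject negative weights by default, and only warn once the user has explicitly opted in.

```python
import warnings

# Hypothetical global switch; scikit-learn does not expose a flag under this name.
_ALLOW_NEGATIVE_WEIGHTS = False


def set_allow_negative_weights(allow):
    """Opt in to (or out of) negative sample weights for later checks."""
    global _ALLOW_NEGATIVE_WEIGHTS
    _ALLOW_NEGATIVE_WEIGHTS = bool(allow)


def check_sample_weight(sample_weight):
    """Return the weights as floats, rejecting negatives unless opted in."""
    weights = [float(w) for w in sample_weight]
    if any(w < 0 for w in weights):
        if not _ALLOW_NEGATIVE_WEIGHTS:
            raise ValueError(
                "sample_weight contains negative values; call "
                "set_allow_negative_weights(True) to permit them."
            )
        # Opted in: still remind the user that results need care.
        warnings.warn(
            "Negative sample weights can produce results outside the "
            "usual range of a metric.",
            UserWarning,
        )
    return weights
```

Under this design, check_sample_weight([1.0, -0.5]) raises ValueError by default. After set_allow_negative_weights(True), the same call returns the weights and emits a UserWarning.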

0 reactions
kmqanda commented, Mar 28, 2022

I would like to check the status of this open item; a global flag enabling negative sample weights would be helpful in my use case. In the meantime, I use a direct implementation that allows negative weights:

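import numpy as np

# Assumed shapes: x is (n, p), y is (n,) or (n, 1), and w is a column (n, 1)
# so that x * w scales each row of x by its weight.
xw = x * w
xwx = np.dot(xw.T, x)                       # X^T W X
xwx_inv = np.linalg.inv(xwx)
coef = np.dot(xwx_inv, np.dot(xw.T, y))     # weighted least-squares coefficients
xf = np.dot(x, coef)                        # fitted values
res = y - xf                                # residuals
res_wss = np.dot(w.T, (res ** 2))           # weighted residual sum of squares
tot_wss = np.dot(w.T, (y ** 2))             # weighted total sum of squares (uncentered)
wr2 = 1 - res_wss / tot_wss                 # weighted R^2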
