Enforce positive sample_weight
See original GitHub issue

As discussed in https://github.com/scikit-learn/scikit-learn/issues/12464#issuecomment-433815773 and https://github.com/scikit-learn/scikit-learn/issues/15358#issuecomment-549048650, negative `sample_weight` may be meaningful in some specific use cases, but in most cases it should never occur.
So we may want to:
- add a `force_positive=None` parameter to `_check_sample_weights`.
- add an `assume_positive_sample_weights=True` config parameter to `sklearn.set_config` / `get_config`.

By default, `force_positive=None` would error on negative sample weights, but this check could be disabled globally with `sklearn.set_config(assume_positive_sample_weights=False)`.

With `_check_sample_weights(..., force_positive=True)` the check would always be done irrespective of the config parameter.
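A rough sketch of how the proposed check could behave (the function and parameter names follow the proposal above, but the implementation is purely illustrative; the module-level flag stands in for the real `sklearn.get_config()` lookup, which would work differently):

```python
import numpy as np

# Stand-in for the proposed sklearn config entry
# `assume_positive_sample_weights` (illustrative only).
ASSUME_POSITIVE_SAMPLE_WEIGHTS = True


def check_sample_weight(sample_weight, force_positive=None):
    """Hypothetical sketch of the proposed validation.

    force_positive=None  -> defer to the global config flag
    force_positive=True  -> always reject negative weights
    force_positive=False -> never reject negative weights
    """
    sample_weight = np.asarray(sample_weight, dtype=float)
    if force_positive is None:
        force_positive = ASSUME_POSITIVE_SAMPLE_WEIGHTS
    if force_positive and np.any(sample_weight < 0):
        raise ValueError("Negative values in sample_weight are not allowed.")
    return sample_weight
```

With the flag at its default, `check_sample_weight([1.0, -1.0])` raises, while passing `force_positive=False` lets the negative weight through for the rare estimators and applications that can handle it.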
If there are no objections to this, please tag this issue as “Help Wanted”.
Issue Analytics
- State:
- Created 4 years ago
- Reactions: 1
- Comments:14 (14 by maintainers)
I think the config still makes sense since negative sample weights don’t make sense in the vast majority of applications. You really want to know what you’re doing when using negative SW.
Whether estimators can support them is another thing, independent of the application.
I think zero should always be allowed as a valid value for sample weights, but sum(sw) > 0 should hold.
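That rule (zeros allowed, strictly positive total) could be expressed as a small check; this is a hypothetical sketch, not scikit-learn's actual validation:

```python
import numpy as np


def validate_nonnegative_sum(sample_weight):
    """Allow zero weights (they simply drop samples), but require
    a strictly positive total so weighted statistics stay defined."""
    sw = np.asarray(sample_weight, dtype=float)
    if np.any(sw < 0):
        raise ValueError("sample_weight must be non-negative.")
    if sw.sum() <= 0:
        raise ValueError("sum(sample_weight) must be > 0.")
    return sw
```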
For our dear HEP friends who want negative sample weights: in linear models, they are often implemented via scaling X and y by sqrt(sw), which makes supporting negative values difficult.