
TST Check correct interactions of `class_weight` and `sample_weight`

See original GitHub issue

In scikit-learn, some estimators support class_weight and sample_weight.

It might be worth testing the correct interaction of those two types of weights, especially asserting that:

  • setting a class's weight to zero is equivalent to excluding the samples of that class from the fit, even when using non-uniform sample weights;
  • setting some sample weights to zero is equivalent to excluding those samples from the fit, even when using non-uniform class weights.
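The second of these invariances can be sketched as a quick standalone check. This is a hypothetical example, not an existing scikit-learn test; the dataset, tolerances, and choice of LogisticRegression are assumptions made for illustration:

```python
# Sketch: zeroing some sample weights should match dropping those samples,
# even with non-uniform class weights (illustrative, not a sklearn test).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.rand(80, 4)
y = rng.randint(0, 2, 80)
class_weight = {0: 1.0, 1: 3.0}  # non-uniform class weights

sample_weight = np.ones(80)
sample_weight[::4] = 0.0         # zero out every 4th sample
kept = sample_weight != 0

# Tight solver tolerance so both fits converge to the same optimum.
clf_zeroed = LogisticRegression(class_weight=class_weight, tol=1e-8,
                                max_iter=1000)
clf_zeroed.fit(X, y, sample_weight=sample_weight)

clf_dropped = LogisticRegression(class_weight=class_weight, tol=1e-8,
                                 max_iter=1000)
clf_dropped.fit(X[kept], y[kept])

# Both objectives are mathematically identical, so the coefficients
# should agree up to solver precision.
assert np.allclose(clf_zeroed.coef_, clf_dropped.coef_, atol=1e-3)
```

The comparison uses a loose `atol` because the two fits sum the loss in a different order, so exact bit-for-bit equality is not guaranteed.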

Relevant interfaces:

  • the main subclasses of sklearn.tree.BaseDecisionTree for classification, i.e.:
    • sklearn.tree.DecisionTreeClassifier
    • sklearn.tree.ExtraTreeClassifier
  • the main subclasses of sklearn.ensemble.BaseForest for classification and embedding, i.e.:
    • sklearn.ensemble.RandomTreesEmbedding
    • sklearn.ensemble.RandomForestClassifier
    • sklearn.ensemble.ExtraTreesClassifier
  • sklearn.linear_model.LogisticRegression
  • sklearn.linear_model.LogisticRegressionCV
  • sklearn.calibration.CalibratedClassifierCV after the merge of #17541
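For the tree-based estimators in this list, the documented behavior is that the two weightings multiply: `class_weight` is expanded to a per-sample weight and combined with `sample_weight`. A small sketch of that equivalence (illustrative data; not one of the proposed tests):

```python
# class_weight and sample_weight multiply: fitting with both should equal
# fitting with their per-sample product as a single sample_weight.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.class_weight import compute_sample_weight

rng = np.random.RandomState(42)
X = rng.rand(100, 3)
y = rng.randint(0, 3, 100)
sample_weight = rng.uniform(0.5, 2.0, 100)
class_weight = {0: 1.0, 1: 2.0, 2: 0.5}

clf_both = DecisionTreeClassifier(random_state=0, class_weight=class_weight)
clf_both.fit(X, y, sample_weight=sample_weight)

# Expand the per-class weights to per-sample weights and multiply them in.
combined = sample_weight * compute_sample_weight(class_weight, y)
clf_combined = DecisionTreeClassifier(random_state=0)
clf_combined.fit(X, y, sample_weight=combined)

# Same effective weights and same random_state -> identical trees.
assert np.array_equal(clf_both.predict(X), clf_combined.predict(X))
```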

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 10 (9 by maintainers)

Top GitHub Comments

mlant commented, Nov 4, 2021 (1 reaction)

Thank you, I think it’s clear. I’ll begin tomorrow; if I run into any issues I’ll come back to you.

rth commented, Nov 4, 2021 (1 reaction)

I think we could probably add this directly as a common test (and maybe skip estimators that fail at first)? It’s very similar in spirit to https://github.com/scikit-learn/scikit-learn/blob/46485a93beccd79d0e3563512e8385b3e5667524/sklearn/utils/estimator_checks.py#L972 so I imagine we could add a similar def check_class_weights_invariance(name, estimator_orig) and only run it if the estimator has the “class_weight” init parameter.

Thanks for raising this @jjerphan !
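A rough sketch of what such a common check could look like. All names here are hypothetical (this is not the actual `estimator_checks` API); it relies on the expectation that a zero class weight should reduce to zero sample weights for that class:

```python
# Hypothetical common check: zeroing a class's weight via class_weight
# should be equivalent to zeroing the sample weights of that class.
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def check_class_weights_invariance(name, estimator_orig):
    rng = np.random.RandomState(0)
    X = rng.rand(60, 3)
    y = rng.randint(0, 3, 60)

    # Zero out class 2 through class_weight ...
    est_cw = clone(estimator_orig).set_params(
        class_weight={0: 1.0, 1: 1.0, 2: 0.0})
    est_cw.fit(X, y)

    # ... versus zeroing the sample weights of class-2 samples directly.
    est_sw = clone(estimator_orig).set_params(class_weight=None)
    est_sw.fit(X, y, sample_weight=(y != 2).astype(float))

    assert np.array_equal(est_cw.predict(X), est_sw.predict(X)), name

# Example run with one estimator that supports both weightings.
check_class_weights_invariance(
    "DecisionTreeClassifier", DecisionTreeClassifier(random_state=0))
```

A real common test would also need to handle estimators where the two fits only agree up to solver tolerance, rather than exactly as for trees.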


Top Results From Across the Web

How can correct sample_weight in sklearn.naive_bayes?
The sample_weight and class_weight are two different things. As their name suggests: sample_weight is to be applied to individual samples ...

python - difference between sample_weight and class_weight ...
So: The sample weights exist to change the importance of data-points whereas the class weights change the weights to correct class imbalance.

How To Dealing With Imbalanced Classes in Machine Learning
Learn how to deal with imbalanced classes in machine learning by improving the class imbalance using Python and improve your model.

Why Weight? The Importance of Training on Balanced Datasets
Calculate sample weights. Balanced class weights can be automatically calculated within the sample weight function. Set class_weight = 'balanced ...

Beating Naive Bayes at Taxonomic Classification of 16S rRNA ...
Class weight information can be utilized by a variety of supervised ... To test whether reducing the classification confidence threshold ...
