Feature Request: bias regression metric
Following on from https://github.com/scikit-learn/scikit-learn/issues/17853: I'm interested in the mean of the error, which I'm calling bias (over- or under-predicting). I can do this as a one-liner in numpy (`np.average(y_pred - y_true)`), but I would prefer to stay in scikit-learn.
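For illustration, here is what that one-liner reports on a small made-up example (values are purely illustrative); the sign tells you the direction of the error:

```python
import numpy as np

# Illustrative values only: a positive result means the model over-predicts
# on average, a negative result means it under-predicts.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.5, 3.5, 3.0])

print(np.average(y_pred - y_true))  # 0.125 -> slight over-prediction on average
```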
Describe the workflow you want to enable
bias(y_true, y_pred)
Describe your proposed solution
It has mostly been implemented already in https://github.com/scikit-learn/scikit-learn/blob/fd237278e/sklearn/metrics/_regression.py#L181. It would just be a matter of adjusting

```python
output_errors = np.average(np.abs(y_pred - y_true),
                           weights=sample_weight, axis=0)
```

to

```python
output_errors = np.average(y_pred - y_true,
                           weights=sample_weight, axis=0)
```
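As a rough sketch of what the resulting metric could look like (the name `mean_bias_error` and the exact signature are hypothetical; a real implementation would reuse scikit-learn's input validation from `sklearn/metrics/_regression.py` rather than the bare `np.asarray` calls below):

```python
import numpy as np

def mean_bias_error(y_true, y_pred, *, sample_weight=None,
                    multioutput="uniform_average"):
    """Signed mean error (y_pred - y_true): positive means over-prediction
    on average, negative means under-prediction. Hypothetical sketch only."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Same as in mean_absolute_error, but without np.abs().
    output_errors = np.average(y_pred - y_true, weights=sample_weight, axis=0)

    if isinstance(multioutput, str) and multioutput == "raw_values":
        return output_errors
    # "uniform_average" averages outputs equally; otherwise multioutput is
    # interpreted as per-output weights, mirroring mean_absolute_error.
    weights = None if multioutput == "uniform_average" else multioutput
    return np.average(output_errors, weights=weights)
```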
Describe alternatives you’ve considered, if relevant
Additional context
Discussion of whether this is an error metric or not at https://github.com/scikit-learn/scikit-learn/issues/17853#issuecomment-654429486
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
First of all, a one-liner like

```python
np.average(y_pred - y_true)
```

is not a good fit for inclusion. This one-liner would need tests and documentation, and thus a lot of maintenance, and in users' code it would only replace one one-liner with another. So what is the point? Therefore, I'm -1.

The other point is that, in my opinion, assessing bias or calibration is a very important topic for ML models, but it seems a bad fit for the metrics module, because there you typically have scores for model comparison. Scores have an ordering (e.g. larger is better). Calibration, in general, is different (e.g. sign-sensitive), because you want to know whether your model over- or under-predicts. Otherwise stated: scores can be optimised, whereas calibration is more like the first-order optimality condition (the derivative).
Currently, we don’t have good solutions to detect bias except for the reliability diagram/calibration curve for binary classification, further discussions e.g. in #18020.
Again, in my opinion, something that would be a good fit for the metrics module is #23767.
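To make the sign-sensitivity point above concrete, here is a minimal illustration (made-up numbers): a model whose over- and under-predictions cancel out has zero bias but a clearly non-zero MAE, so the bias value cannot be ranked or optimised like a score.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

y_true = np.array([2.0, 4.0, 6.0, 8.0])
y_pred = np.array([3.0, 3.0, 7.0, 7.0])  # over- and under-predictions cancel

bias = np.average(y_pred - y_true)         # 0.0 -- looks "perfect"
mae = mean_absolute_error(y_true, y_pred)  # 1.0 -- every prediction is off by 1
print(bias, mae)
```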
Hi @lorentzenchr
Personally, I +1 the comment by @thomasjpfan in #17853 that this isn't a good metric. Has there been consensus on whether it should be introduced into scikit-learn or not?