Feature Request: bias regression metric
Following on from https://github.com/scikit-learn/scikit-learn/issues/17853: I'm interested in the mean of the error, which I'm calling bias (over- or under-predicting). I can do this as a one-liner in numpy (`np.average(y_pred - y_true)`), but I would prefer to stay in scikit-learn.
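For illustration, here is what that one-liner reports on a small made-up example (values are purely illustrative); the sign tells you the direction of the error:

```python
import numpy as np

# Illustrative values only: a positive result means the model over-predicts
# on average, a negative result means it under-predicts.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.5, 3.5, 3.0])

print(np.average(y_pred - y_true))  # 0.125 -> slight over-prediction on average
```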
Describe the workflow you want to enable
bias(y_true, y_pred)
Describe your proposed solution
It has mostly been implemented already in https://github.com/scikit-learn/scikit-learn/blob/fd237278e/sklearn/metrics/_regression.py#L181. It would just be a matter of adjusting

```python
output_errors = np.average(np.abs(y_pred - y_true),
                           weights=sample_weight, axis=0)
```

to

```python
output_errors = np.average(y_pred - y_true,
                           weights=sample_weight, axis=0)
```
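As a rough sketch of what the resulting metric could look like (the name `mean_bias_error` and the exact signature are hypothetical; a real implementation would reuse scikit-learn's input validation from `sklearn/metrics/_regression.py` rather than the bare `np.asarray` calls below):

```python
import numpy as np

def mean_bias_error(y_true, y_pred, *, sample_weight=None,
                    multioutput="uniform_average"):
    """Signed mean error (y_pred - y_true): positive means over-prediction
    on average, negative means under-prediction. Hypothetical sketch only."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Same as in mean_absolute_error, but without np.abs().
    output_errors = np.average(y_pred - y_true, weights=sample_weight, axis=0)

    if isinstance(multioutput, str) and multioutput == "raw_values":
        return output_errors
    # "uniform_average" averages outputs equally; otherwise multioutput is
    # interpreted as per-output weights, mirroring mean_absolute_error.
    weights = None if multioutput == "uniform_average" else multioutput
    return np.average(output_errors, weights=weights)
```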
Describe alternatives you’ve considered, if relevant
Additional context
Discussion of whether this is an error metric or not at https://github.com/scikit-learn/scikit-learn/issues/17853#issuecomment-654429486
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
First of all, a one-liner like

```python
np.average(y_pred - y_true)
```

is not a good fit for inclusion. This one-liner would need tests and documentation, and thus a lot of maintenance, and in users' code it would only replace one one-liner with another. So what is the point? Therefore, I'm -1.

The other point is that, in my opinion, assessing bias or calibration is a very important topic for ML models, but it seems a bad fit for the metrics module, because there you typically have scores for model comparison. Scores have an ordering (e.g. larger is better). Calibration, in general, is different (e.g. sign-sensitive), because you want to know whether your model over- or under-predicts. Otherwise stated: scores can be optimised, whereas calibration is more like the first-order optimality condition (the derivative).
Currently, we don’t have good solutions to detect bias except for the reliability diagram/calibration curve for binary classification, further discussions e.g. in #18020.
Again, in my opinion, something that would be a good fit for the metrics module is #23767.
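To make the sign-sensitivity point above concrete, here is a minimal illustration (made-up numbers): a model whose over- and under-predictions cancel out has zero bias but a clearly non-zero MAE, so the bias value cannot be ranked or optimised like a score.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

y_true = np.array([2.0, 4.0, 6.0, 8.0])
y_pred = np.array([3.0, 3.0, 7.0, 7.0])  # over- and under-predictions cancel

bias = np.average(y_pred - y_true)         # 0.0 -- looks "perfect"
mae = mean_absolute_error(y_true, y_pred)  # 1.0 -- every prediction is off by 1
print(bias, mae)
```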
Hi @lorentzenchr
Personally, I +1 the comment by @thomasjpfan in #17853 that this isn't a good metric. Has there been consensus on whether it should be introduced into scikit-learn or not?