Add sklearn.metrics.cumulative_gain_curve and sklearn.metrics.lift_curve


Description

I recently added plot_cumulative_gain and plot_lift_curve methods to scikit-plot (https://github.com/reiinakano/scikit-plot). To do this, I built an ad hoc version of cumulative_gain_curve that closely follows the sklearn.metrics.roc_curve interface; see https://github.com/reiinakano/scikit-plot/blob/master/scikitplot/helpers.py#L157. Let me know if sklearn.metrics.cumulative_gain_curve is something you’d be interested in adding to scikit-learn. I could add example docs for plotting gain and lift curves as well.

Reference I followed for lift and gain: https://www.ibm.com/support/knowledgecenter/en/SSLVMB_23.0.0/spss/tutorials/mlp_bankloan_outputtype_02.html
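
To make the proposal concrete, here is a minimal sketch of what a cumulative gain computation could look like. It is not the scikit-plot helper linked above and not an existing scikit-learn function; the name cumulative_gain_curve, its signature, and the handling of ties (left to a stable sort) are assumptions made for illustration only.

```python
import numpy as np


def cumulative_gain_curve(y_true, y_score, pos_label=1):
    """Hypothetical sketch: return (fraction of samples, fraction of positives found).

    Samples are ranked from highest to lowest score; point k of the curve is
    the share of all positives contained in the top k samples.
    """
    is_pos = np.asarray(y_true) == pos_label
    y_score = np.asarray(y_score, dtype=float)
    n_samples = len(is_pos)
    if is_pos.sum() == 0:
        raise ValueError("y_true contains no samples with the positive label")

    # Stable sort by descending score; tied scores keep their input order.
    order = np.argsort(-y_score, kind="mergesort")
    hits = np.cumsum(is_pos[order])

    # Prepend the origin so the curve starts at (0, 0).
    gains = np.concatenate([[0.0], hits / hits[-1]])
    percentages = np.arange(n_samples + 1) / n_samples
    return percentages, gains


# Toy data: two positives among five samples.
y_true = [1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.2, 0.1]
percentages, gains = cumulative_gain_curve(y_true, y_score)
```

In this toy example both positives are found within the top 60% of samples, so the gain reaches 1.0 at that point and stays there. Returning two arrays mirrors the array-returning style of roc_curve, which would let the result be passed straight to a plotting call.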

[Images: example output of plot_cumulative_gain and plot_lift_curve]

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Reactions: 5
  • Comments: 7 (6 by maintainers)

Top GitHub Comments

GuillemGSubies commented, Jul 18, 2019 (2 reactions)

Any progress here? An intuitive explanation of the lift curve can be found here:

http://www2.cs.uregina.ca/~dbd/cs831/notes/lift_chart/lift_chart.html

In other words, it answers the question “how much better than a random model am I doing at each percentile?”
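
As a rough illustration of that intuition, the sketch below computes lift as the positive rate among the top k ranked samples divided by the overall positive rate, which is what a random ranking would achieve on average. The function name lift_curve and its signature are hypothetical; this is not an existing scikit-learn or scikit-plot API.

```python
import numpy as np


def lift_curve(y_true, y_score, pos_label=1):
    """Hypothetical sketch: return (fraction of samples, lift over random)."""
    is_pos = np.asarray(y_true) == pos_label
    y_score = np.asarray(y_score, dtype=float)
    n_samples = len(is_pos)
    base_rate = is_pos.mean()  # positive rate a random ranking achieves on average
    if base_rate == 0:
        raise ValueError("y_true contains no samples with the positive label")

    # Rank samples from highest to lowest score (stable for tied scores).
    order = np.argsort(-y_score, kind="mergesort")
    k = np.arange(1, n_samples + 1)

    # Positive rate among the top-k samples, relative to the base rate.
    rate_at_k = np.cumsum(is_pos[order]) / k
    return k / n_samples, rate_at_k / base_rate


# Toy data: two positives among five samples.
y_true = [1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.2, 0.1]
fractions, lift = lift_curve(y_true, y_score)
```

A lift of 2.5 at the first point means the top 20% of samples contains positives at 2.5 times the overall rate. The curve always ends at 1.0, because selecting every sample is no better than random.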

lorentzenchr commented, Oct 12, 2022 (1 reaction)

TLDR

+1 for inclusion of the gain curve/CAP. Naming should reflect different strands of literature: cumulative accuracy profile (CAP) [2][4], concentration curve [3], cumulative lift curve [5]. It should work for binary classification as well as regression (models for the expectation).

Some more background

The cumulative gains curve is the same as the Cumulative Accuracy Profile (CAP), see [1] and [4]. From [2]:

“Moody’s uses Cumulative Accuracy Profiles (CAP), to make visual, qualitative assessments of model performance. While similar tools exist under a variety of different names (lift-curves, dubbed-curves, receiver-operator curves, power curves, etc.).”

References:
[1] Tasche 2006 “Validation of internal rating systems and PD estimates” https://arxiv.org/pdf/physics/0606071.pdf
[2] Soběhart, J.R., Keenan, S.C., & Stein, R.M. (2000). “Benchmarking Quantitative Default Risk Models: A Validation Methodology”
[3] Denuit, M., Trufin, J. (2021). “Lorenz curve, Gini coefficient, and Tweedie dominance for autocalibrated predictors” https://dial.uclouvain.be/pr/boreal/object/boreal%3A254535/datastream/PDF_01/view
[4] https://www.listendata.com/2019/09/gini-cumulative-accuracy-profile-auc.html
[5] Ling C, Li C (1998). “Data Mining for Direct Marketing: Problems and solutions.” In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, pp. 73–79.
