Add sklearn.metrics.cumulative_gain_curve and sklearn.metrics.lift_curve
Description
I recently added plot_cumulative_gain and plot_lift_curve methods to https://github.com/reiinakano/scikit-plot. To do this, I built an ad hoc version of cumulative_gain_curve, closely following the sklearn.metrics.roc_curve interface, at https://github.com/reiinakano/scikit-plot/blob/master/scikitplot/helpers.py#L157. Let me know if sklearn.metrics.cumulative_gain_curve is something you’d be interested in adding to scikit-learn. I could add example docs for plotting gain and lift curves as well.
Reference I followed for lift and gain: https://www.ibm.com/support/knowledgecenter/en/SSLVMB_23.0.0/spss/tutorials/mlp_bankloan_outputtype_02.html
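For readers unfamiliar with the idea, here is a minimal sketch of what a roc_curve-style cumulative gain function could look like. It is not the scikit-plot implementation linked above, and no such function exists in sklearn.metrics; the name, signature, and return values are assumptions chosen for illustration, and ties in y_score are simply kept in input order.

```python
import numpy as np


def cumulative_gain_curve(y_true, y_score, pos_label=1):
    """Hypothetical sketch of a cumulative gain curve (binary targets only).

    Returns (percentages, gains): the fraction of samples examined, and the
    fraction of all positives found among them, both starting at 0.
    """
    is_pos = np.asarray(y_true) == pos_label
    y_score = np.asarray(y_score, dtype=float)
    n_samples = len(is_pos)
    n_pos = is_pos.sum()
    if n_pos == 0:
        raise ValueError("y_true contains no samples with the positive label")

    # Rank samples from highest to lowest score; a stable sort keeps ties
    # in their original input order.
    order = np.argsort(-y_score, kind="stable")

    # Running count of positives captured as we move down the ranking.
    hits = np.cumsum(is_pos[order])

    percentages = np.arange(n_samples + 1) / n_samples
    gains = np.concatenate(([0.0], hits / n_pos))
    return percentages, gains


percentages, gains = cumulative_gain_curve([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

In this toy call, the top-scored quarter of the samples already contains half of the positives, so the curve rises above the diagonal that a random ranking would follow.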
Issue Analytics
- State:
- Created 6 years ago
- Reactions:5
- Comments:7 (6 by maintainers)
Top GitHub Comments
Any progress here? An intuitive explanation of the lift curve can be found here:
http://www2.cs.uregina.ca/~dbd/cs831/notes/lift_chart/lift_chart.html
Roughly, it answers the question “how much better than a random model am I doing at each percentile?”
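That reading can be sketched in a few lines: lift at a given percentile is the share of positives captured so far divided by the share of samples examined, so a random ranking gives a lift of about 1 everywhere. The function below is a hypothetical illustration, not an existing scikit-learn or scikit-plot API; it assumes binary targets and skips the 0% point, where the ratio is undefined.

```python
import numpy as np


def lift_curve(y_true, y_score, pos_label=1):
    """Hypothetical sketch of a lift curve (binary targets only).

    Returns (percentages, lift), where lift is the cumulative gain divided
    by the fraction of samples examined. The 0% point is omitted (0 / 0).
    """
    is_pos = np.asarray(y_true) == pos_label
    y_score = np.asarray(y_score, dtype=float)
    n_samples = len(is_pos)
    n_pos = is_pos.sum()
    if n_pos == 0:
        raise ValueError("y_true contains no samples with the positive label")

    # Rank samples from highest to lowest score (ties keep input order).
    order = np.argsort(-y_score, kind="stable")

    # Fraction of all positives found in the top-k samples, for k = 1..n.
    gains = np.cumsum(is_pos[order]) / n_pos
    percentages = np.arange(1, n_samples + 1) / n_samples

    # A random ranking has gains roughly equal to percentages, i.e. lift near 1.
    lift = gains / percentages
    return percentages, lift


percentages, lift = lift_curve([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

The last lift value is always 1, because once every sample has been examined, every positive has been found.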
TLDR
+1 for inclusion of the gain curve/CAP. Naming should reflect different strands of literature: cumulative accuracy profile (CAP) [2][4], concentration curve [3], cumulative lift curve [5]. It should work for binary classification as well as regression (models for the expectation).
Some more background
The cumulative gains curve is the same as the Cumulative Accuracy Profile (CAP), see [1] and [4]. From [2]
References:
[1] Tasche (2006). “Validation of internal rating systems and PD estimates.” https://arxiv.org/pdf/physics/0606071.pdf
[2] Soběhart, J.R., Keenan, S.C., & Stein, R.M. (2000). “Benchmarking Quantitative Default Risk Models: A Validation Methodology.”
[3] Denuit, M., Trufin, J. (2021). “Lorenz curve, Gini coefficient, and Tweedie dominance for autocalibrated predictors.” https://dial.uclouvain.be/pr/boreal/object/boreal%3A254535/datastream/PDF_01/view
[4] https://www.listendata.com/2019/09/gini-cumulative-accuracy-profile-auc.html
[5] Ling, C., Li, C. (1998). “Data Mining for Direct Marketing: Problems and solutions.” In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, pp. 73–79.