Function to get scorers for task
I would like to see a utility that constructs a set of applicable scorers for a particular task, returning a Mapping from string to callable scorer. It will be hard to get this API right the first time. [Maybe this should initially be developed outside this project and contributed to scikit-learn-contrib, but I think it reduces the risk of mis-specifying scorers, so it's of benefit to this project.]
The user will be able to select a subset of the scorers, either with a dict comprehension or with some specialised methods or function parameters. Initially it wouldn’t be efficient to run all these scorers, but hopefully we can do something to fix #10802 😐.
Let's take for instance a binary classification task. For binary `y`, the function `get_applicable_scorers(y, pos_label='yes')` might produce something like:
```python
{
    'accuracy': make_scorer(accuracy_score),
    'balanced_accuracy': make_scorer(balanced_accuracy_score),
    'matthews_corrcoef': make_scorer(matthews_corrcoef),
    'cohens_kappa': make_scorer(cohen_kappa_score),
    'precision': make_scorer(precision_score, pos_label='yes'),
    'recall': make_scorer(recall_score, pos_label='yes'),
    'f1': make_scorer(f1_score, pos_label='yes'),
    'f0.5': make_scorer(fbeta_score, pos_label='yes', beta=0.5),
    'f2': make_scorer(fbeta_score, pos_label='yes', beta=2),
    'specificity': ...,
    'miss_rate': ...,
    ...
    'roc_auc': make_scorer(roc_auc_score, needs_threshold=True),
    'average_precision': make_scorer(average_precision_score, needs_threshold=True),
    'neg_log_loss': make_scorer(log_loss, needs_proba=True, greater_is_better=False),
    'neg_brier_score_loss': make_scorer(brier_score_loss, needs_proba=True, greater_is_better=False),
}
```
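To make the shape of the API concrete, here is a minimal sketch of how the binary branch could be built. The function name, signature, and exact scorer set are placeholders, not a settled design:

```python
from sklearn.metrics import (
    accuracy_score, average_precision_score, balanced_accuracy_score,
    brier_score_loss, cohen_kappa_score, f1_score, fbeta_score, log_loss,
    make_scorer, matthews_corrcoef, precision_score, recall_score,
    roc_auc_score,
)
from sklearn.utils.multiclass import type_of_target


def get_applicable_scorers(y, pos_label=1):
    """Build a dict of scorers applicable to the task implied by y (sketch)."""
    if type_of_target(y) != 'binary':
        raise NotImplementedError('only the binary case is sketched here')
    # Metrics computed from hard predictions, no pos_label needed.
    scorers = {
        'accuracy': make_scorer(accuracy_score),
        'balanced_accuracy': make_scorer(balanced_accuracy_score),
        'matthews_corrcoef': make_scorer(matthews_corrcoef),
        'cohens_kappa': make_scorer(cohen_kappa_score),
    }
    # Metrics parametrised by the positive class.
    for name, func in [('precision', precision_score),
                       ('recall', recall_score),
                       ('f1', f1_score)]:
        scorers[name] = make_scorer(func, pos_label=pos_label)
    for beta in (0.5, 2):
        scorers['f%g' % beta] = make_scorer(fbeta_score, pos_label=pos_label,
                                            beta=beta)
    # Metrics needing a continuous score or a probability estimate.
    scorers['roc_auc'] = make_scorer(roc_auc_score, needs_threshold=True)
    scorers['average_precision'] = make_scorer(average_precision_score,
                                               needs_threshold=True)
    scorers['neg_log_loss'] = make_scorer(log_loss, needs_proba=True,
                                          greater_is_better=False)
    scorers['neg_brier_score_loss'] = make_scorer(brier_score_loss,
                                                  needs_proba=True,
                                                  greater_is_better=False)
    return scorers
```

Selecting a subset would then be an ordinary dict comprehension:

```python
scorers = get_applicable_scorers(y, pos_label='yes')
ranking_only = {name: s for name, s in scorers.items()
                if name in ('roc_auc', 'average_precision')}
```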
Doing the same for multiclass classification would pass `labels` as appropriate, and would optionally produce per-class binary metrics as well as overall multiclass metrics.
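Purely for illustration, the per-class scorers could be generated by pinning `labels` to a single class: with `labels=[label]` and `average='macro'`, the metric reduces to that class's one-vs-rest score. A hypothetical generator:

```python
import numpy as np
from sklearn.metrics import f1_score, make_scorer, precision_score, recall_score


def multiclass_scorers(y):
    """Overall plus per-class scorers for a multiclass y (hypothetical helper)."""
    labels = list(np.unique(y))
    scorers = {
        # Overall multiclass metrics, with the label set fixed up front.
        'f1_macro': make_scorer(f1_score, labels=labels, average='macro'),
        'f1_micro': make_scorer(f1_score, labels=labels, average='micro'),
    }
    for label in labels:
        for name, func in [('precision', precision_score),
                           ('recall', recall_score),
                           ('f1', f1_score)]:
            # labels=[label] with average='macro' yields the one-vs-rest
            # score for this single class.
            scorers['%s_%s' % (name, label)] = make_scorer(
                func, labels=[label], average='macro')
    return scorers
```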
I'm not sure how `sample_weight` fits in here, but ha! we still don't support weighted scoring in cross validation (#1574), so let's not worry about that.
Top GitHub Comments
Yes, you're right, the estimator might be useful to determine `predict_proba` or `return_std` support. I suppose that most likely you will need the model too; having the model itself, you would be able to exclude/include scorers.
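A rough sketch of that idea, duck-typing on the estimator; the helper name is made up and the scorer names are the ones from the example above:

```python
def filter_scorers_for_estimator(scorers, estimator):
    """Drop scorers the given estimator cannot serve (hypothetical helper)."""
    out = dict(scorers)
    if not hasattr(estimator, 'predict_proba'):
        # Probability-based scorers require predict_proba.
        for name in ('neg_log_loss', 'neg_brier_score_loss'):
            out.pop(name, None)
    if not (hasattr(estimator, 'decision_function')
            or hasattr(estimator, 'predict_proba')):
        # Threshold-based scorers need some continuous output.
        for name in ('roc_auc', 'average_precision'):
            out.pop(name, None)
    return out
```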