Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Add `return_predictions` option to the model_selection.cross_validate() API

See original GitHub issue

The cross_validate(), cross_val_predict(), and cross_val_score() functions from the model_selection module provide a highly compact and convenient API for most everyday work. Yet one often wants both 1) multiple scores for train and test sets, and 2) predictions for all samples under cross validation, i.e. the aggregation of test predictions from all folds.

The current API doesn’t seem to allow both goals to be achieved in a single CV run. Would it be possible to add something like a return_predictions option to the model_selection.cross_validate() API, as in here, to simply allow access to these results (which I believe have already been computed inside the function)?

Thank you!
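For context, achieving both goals today takes two separate CV runs, fitting every fold's model twice. A minimal sketch of that workaround (using iris and logistic regression purely as stand-ins):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_predict, cross_validate

X, y = load_iris(return_X_y=True)
cv = KFold(n_splits=5, shuffle=True, random_state=0)
est = LogisticRegression(max_iter=1000)

# Run 1: per-fold train and test scores.
scores = cross_validate(est, X, y, cv=cv, return_train_score=True)

# Run 2: aggregated out-of-fold predictions, one per sample --
# every model is fitted a second time.
preds = cross_val_predict(est, X, y, cv=cv)
```

The duplicated fitting is exactly what a `return_predictions` option would avoid.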

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Reactions: 5
  • Comments: 23 (22 by maintainers)

Top GitHub Comments

1 reaction
amueller commented, Aug 6, 2019

I don’t really like returning the indices and models. You’re still requiring the user to reimplement quite a bit; why wouldn’t they just reimplement it completely, as @mostafahadian did?

I would favor adding a return_predictions (and possibly predict_method) option to cross_validate, which would provide per-fold predictions. I think the user can be trusted to hstack them.

[after reading the above more carefully: as usual I agree with @jnothman]
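A hand-rolled sketch of what such an option could expose — per-fold test-set predictions plus the indices needed to stitch them back together. Note that `return_predictions` is hypothetical here; this loop only illustrates the work the user currently has to reimplement:

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)
cv = KFold(n_splits=5, shuffle=True, random_state=0)
est = LogisticRegression(max_iter=1000)

# Collect what a hypothetical return_predictions=True might hand back:
# one prediction array and one test-index array per fold.
fold_preds, fold_indices = [], []
for train_idx, test_idx in cv.split(X, y):
    model = clone(est).fit(X[train_idx], y[train_idx])
    fold_preds.append(model.predict(X[test_idx]))
    fold_indices.append(test_idx)

# "The user can be trusted to hstack them": reassemble one
# out-of-fold prediction per sample in original order.
oof = np.empty_like(y)
oof[np.hstack(fold_indices)] = np.hstack(fold_preds)
```

Returning the indices alongside the predictions is what makes the reassembly trivial, which is the point of keeping them under a separate key.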

1 reaction
jnothman commented, May 22, 2019

I think wanting to get predictions as well as scores is a common use case, albeit open to misuse. To some extent I’d rather have return_predictions, which would return test-set predictions as well as indices (under a separate key), rather than return_includes:

  • there’s no harm including CV indices if returning predictions, and it might help guide users towards reasonable usage patterns;
  • some things should be returned without being asked for, and multi-metric scoring naturally expands the set of returned keys;
  • it’s better for return_train_score to be available consistent with *SearchCV.
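The precedent referred to here — multi-metric scoring expanding cross_validate’s returned keys — is existing behavior, so prediction keys would follow an established pattern:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = load_iris(return_X_y=True)

# Passing several scorers makes cross_validate return one
# test_<metric> / train_<metric> key pair per scorer, alongside
# the always-present fit_time and score_time keys.
res = cross_validate(
    LogisticRegression(max_iter=1000), X, y, cv=5,
    scoring=["accuracy", "f1_macro"], return_train_score=True,
)
```

Keys such as a hypothetical `predictions` and `indices` would simply join this dict.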

Read more comments on GitHub >

Top Results From Across the Web

3.1. Cross-validation: evaluating estimator performance
Here is a flowchart of typical cross validation workflow in model training. The best parameters can be determined by grid search techniques.
Read more >
3.1. Cross-validation: evaluating estimator performance
The simplest way to use cross-validation is to call the cross_val_score helper function on the estimator and the dataset. >>> from sklearn. model_selection...
Read more >
How to use a cross-validated model for prediction?
With the help of CV, you can assess hyperparameters and compare different models to each other. It's just an alternative to a train/test...
Read more >
Cross Validation and Grid Search for Model Selection in Python
One such factor is the performance on cross validation set and another other factor is the choice of parameters for an algorithm.
Read more >
Nested Cross-Validation for Machine Learning with Python
The k-fold cross-validation procedure is used to estimate the performance of machine learning models when making predictions on data not ...
Read more >
