Separate train and test prediction/scoring inside BaseSearchCV
Describe the workflow you want to enable
Separate train and test scoring methods in BaseSearchCV.
Describe your proposed solution
Currently, in BaseSearchCV, train and test scoring is done inside of _fit_and_score. This does not allow separate behaviour for train and test. Instead, one could move the prediction on the train set and on the test set into separate methods of BaseSearchCV, allowing easy subclassing.
Describe alternatives you’ve considered, if relevant
None
Additional context
The use case is that when one wants pipelines to behave differently in train vs test, they could subclass BaseSearchCV and override the train_score and test_score methods, first setting some relevant parameter to enable the different behaviour for train or test, then calling the original scoring method. Any existing BaseSearchCV subclass could then be subclassed to inherit from this new class and pass on said behaviour.
This would not change current sklearn behaviour, but it would save a lot of work and code for anyone who wishes to extend the current BaseSearchCV subclasses.
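To make the proposal concrete, here is a minimal pure-Python sketch of the hook design described above. It is not scikit-learn code: ToySearch, StageAwareSearch, ConstantRegressor, the stage attribute and two_fold_indices are all hypothetical names made up for illustration, and train_score / test_score are the proposed methods, which do not exist on BaseSearchCV today.

```python
def two_fold_indices(n):
    """Hypothetical splitter: even/odd indices, each half used once as test."""
    even, odd = list(range(0, n, 2)), list(range(1, n, 2))
    return [(even, odd), (odd, even)]


class ConstantRegressor:
    """Toy estimator: predicts the training mean plus a fixed offset."""

    def __init__(self, offset=0.0):
        self.offset = offset
        self.stage = None        # hypothetical flag a search subclass may set
        self.stages_seen = []    # records the stage active at each score() call

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def score(self, X, y):
        self.stages_seen.append(self.stage)
        pred = self.mean_ + self.offset
        return -sum(abs(pred - t) for t in y) / len(y)  # negative MAE


class ToySearch:
    """Stand-in for a BaseSearchCV-like class with separate scoring hooks."""

    def __init__(self, make_estimator, candidates):
        self.make_estimator = make_estimator
        self.candidates = candidates

    # The two proposed hooks; by default both just delegate to score().
    def train_score(self, estimator, X, y):
        return estimator.score(X, y)

    def test_score(self, estimator, X, y):
        return estimator.score(X, y)

    def _fit_and_score(self, params, X, y, train, test):
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        X_te, y_te = [X[i] for i in test], [y[i] for i in test]
        est = self.make_estimator(**params).fit(X_tr, y_tr)
        return {
            "params": params,
            "train_score": self.train_score(est, X_tr, y_tr),
            "test_score": self.test_score(est, X_te, y_te),
            "estimator": est,
        }

    def fit(self, X, y):
        self.results_ = [
            self._fit_and_score(params, X, y, train, test)
            for params in self.candidates
            for train, test in two_fold_indices(len(y))
        ]
        return self


class StageAwareSearch(ToySearch):
    """Sets a stage flag first, then runs the original scoring method."""

    def train_score(self, estimator, X, y):
        estimator.stage = "train"
        return super().train_score(estimator, X, y)

    def test_score(self, estimator, X, y):
        estimator.stage = "test"
        return super().test_score(estimator, X, y)


X = [[0], [1], [2], [3]]
y = [0.0, 1.0, 2.0, 3.0]
candidates = [{"offset": 0.0}, {"offset": 1.0}]
plain = ToySearch(ConstantRegressor, candidates).fit(X, y)
search = StageAwareSearch(ConstantRegressor, candidates).fit(X, y)
```

In this sketch the default hooks leave behaviour unchanged (the plain search never sets a stage), while the subclass only overrides the two hooks and reuses everything else, which is the kind of extension point the proposal asks for.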
Issue Analytics
- State:
- Created 3 years ago
- Comments:16 (16 by maintainers)
OK, thanks for your thoughts. Re it being only a partial solution of a specific use case, yep makes sense and I completely understand the resistance. My only counter offer would be that I’m happy to do the work for the PR, but understand this still means you guys need to go through the review process.
As for pipelines being stage-aware, although it's not on your roadmap, I think it's something that should be, especially as compute increases and the use of things like stacking does with it. I'd be interested in being involved in/starting the discussion should there be one.
Thanks for your time!
Yes, I think this need to use a generated dataset, effectively, at training time, is handled by the resampling functionality. Similar to using kfold y in stacking.