
*SearchCV should warn if calculating train score is unduly expensive


I suspect we made a mistake in letting return_train_score default to True (see #9619 for example) in GridSearchCV, as it can sometimes be expensive to score over a training set.

I think it would be useful to users if we issued a warning (in _fit_and_score? at the end of fitting?) when train scoring takes more than, say, 10% of the fit and test score time and more than a few seconds over the entire grid search… or something.
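To make the proposed heuristic concrete, here is a minimal sketch in plain Python. The function name `maybe_warn_train_score`, its parameters, and the 5-second floor are assumptions made for illustration; the 10% ratio is the tentative figure floated above. None of this is scikit-learn code.

```python
import warnings


def maybe_warn_train_score(fit_time, test_score_time, train_score_time,
                           ratio_threshold=0.10, min_seconds=5.0):
    """Warn when train scoring looks unduly expensive.

    All times are total seconds summed over the whole search.
    Both thresholds are illustrative assumptions, not library values.
    Returns True if a warning was issued, False otherwise.
    """
    baseline = fit_time + test_score_time
    # Condition 1: train scoring is a sizeable share of the other work.
    is_large_share = train_score_time > ratio_threshold * baseline
    # Condition 2: train scoring is long in absolute terms.
    is_long = train_score_time > min_seconds
    if is_large_share and is_long:
        warnings.warn(
            "Computing training scores took %.1fs on top of %.1fs of "
            "fitting and test scoring. Consider return_train_score=False."
            % (train_score_time, baseline),
            UserWarning,
        )
        return True
    return False
```

Requiring both conditions keeps the warning quiet for searches that are fast overall and for searches where train scoring is only a small fraction of a long run.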

It’s hard to come up with a precise heuristic, and it’s hard to know when to issue it.

The other option is to make return_train_score=True stop being the default…

Issue Analytics

Created 6 years ago
Comments: 11 (11 by maintainers)

Top GitHub Comments

1 reaction

jnothman commented, Sep 2, 2017

No, we can’t print before scoring the training data. I’m not really expecting users to stop the process if it’s taking a long time, but rather to understand that the option lengthens it, and to change the option in future runs. So even a warning after all fits are done may suffice.

On 1 Sep 2017 10:18 pm, “Kumar Ashutosh” wrote:

@amueller We need to print a warning before scoring training data, right? And 5s after starting an estimator, right? I am a bit confused. And I agree with the idea of changing the default to “warn”.


0 reactions

thechargedneutron commented, Sep 1, 2017

@amueller We need to print a warning before scoring training data, right? And 5s after starting an estimator, right? I am a bit confused. And I agree with the idea of changing the default to “warn”.
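For readers unfamiliar with the “warn” default idea, the sketch below shows one common way such a sentinel default can work. The default is a string that means "the user did not choose", which lets the code keep the old behaviour while announcing the change. The helper `resolve_return_train_score` and its message are hypothetical and are not taken from scikit-learn.

```python
import warnings


def resolve_return_train_score(return_train_score="warn"):
    """Return the effective boolean for a hypothetical return_train_score option.

    "warn" is a sentinel default: it behaves like the old default (True)
    but tells the user that the default may change.
    """
    if return_train_score == "warn":
        warnings.warn(
            "return_train_score was not set explicitly; it currently behaves "
            "as True, but the default may change to False. Pass True or "
            "False to silence this warning.",
            FutureWarning,
        )
        return True
    # An explicit choice is respected silently.
    return bool(return_train_score)
```

Users who pass `True` or `False` explicitly never see the warning, so only those relying on the default are notified.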