Parameter searches: folds used in evaluate_candidates aren't consistent across calls
See original GitHub issue

If I use a custom CV iterator with shuffle=True, random_state=None, then different folds will be generated for each call to evaluate_candidates().

This isn't an issue for GridSearchCV or RandomizedSearchCV, since these only call evaluate_candidates() once. But it is fundamentally wrong for e.g. Successive Halving #13900, which repeatedly calls evaluate_candidates() but assumes the folds are always the same (if resource != n_samples). This is even more of an issue if we implement warm start for SH: estimators would not be warm-started on the same data.

ping in particular @jnothman, I think this is something you noted before (but can't remember where, sorry)?
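A minimal sketch (not from the issue itself) of the behaviour described above, assuming a plain KFold splitter stands in for the custom CV iterator: with shuffle=True and random_state=None, two passes over the same splitter draw different permutations, which is exactly what repeated evaluate_candidates() calls would observe.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)

# With shuffle=True and random_state=None, each call to split() draws a fresh
# permutation, so two passes over the same CV object yield different folds.
cv = KFold(n_splits=5, shuffle=True, random_state=None)
first_pass = [test for _, test in cv.split(X)]
second_pass = [test for _, test in cv.split(X)]
print(all(np.array_equal(a, b) for a, b in zip(first_pass, second_pass)))
# almost certainly False

# Fixing random_state makes the folds identical across calls.
cv_seeded = KFold(n_splits=5, shuffle=True, random_state=0)
first_pass = [test for _, test in cv_seeded.split(X)]
second_pass = [test for _, test in cv_seeded.split(X)]
print(all(np.array_equal(a, b) for a, b in zip(first_pass, second_pass)))
# True
```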
Issue Analytics
- Created: 4 years ago
- Comments: 16 (16 by maintainers)
I would prefer storing the seed. The RandomState.get_state() tuple is quite big compared to a seed (see the numpy reference).

I would not mind erroring on this with a good error message.
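A hypothetical sketch of the "store the seed" idea, not the actual scikit-learn implementation: draw one integer seed up front and re-seed the splitter with it before every pass, so the folds stay identical across repeated evaluate_candidates() calls.

```python
import numpy as np
from sklearn.model_selection import KFold

# Draw a single integer seed once and keep it; this is much smaller than the
# full RandomState.get_state() tuple and is enough to reproduce the folds.
seed = np.random.randint(np.iinfo(np.int32).max)

def make_cv():
    # Re-creating the splitter with the stored seed yields the same folds
    # every time, unlike shuffle=True with random_state=None.
    return KFold(n_splits=5, shuffle=True, random_state=seed)

X = np.arange(40).reshape(20, 2)
folds_call_1 = [test for _, test in make_cv().split(X)]
folds_call_2 = [test for _, test in make_cv().split(X)]
assert all(np.array_equal(a, b) for a, b in zip(folds_call_1, folds_call_2))
```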