Use modifiable global random state in tests
As mentioned by @jnothman in https://github.com/scikit-learn/scikit-learn/issues/13846#issuecomment-494175027,

> Relatedly, I proposed having a random_seed fixture that was globally set to different values on different testing runs. One benefit would be that we could easily distinguish those tests that are invariant under changing random seed from those that are brittle.

I think it would be a good idea. For instance, we could,
- create a global ~~auto-use~~ pytest fixture in `scikit-learn/conftest.py`,

      import os
      import numpy as np
      import pytest

      @pytest.fixture(scope="session")
      def pytest_rng():
          random_seed = int(os.environ.get('SKLEARN_TEST_RNG', 42))  # env values are strings
          return np.random.RandomState(random_seed)

- modify tests to use it, e.g.

      - def test_something():
      + def test_something(pytest_rng):
      -     rng = np.random.RandomState(0)
      -     est = Estimator(random_state=rng)
      +     est = Estimator(random_state=pytest_rng)

One issue is that global auto-use fixtures are a bit magical, but I’m hoping that naming it pytest_rng would be explicit enough.
Edit: updated to avoid using an auto-use fixture.
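
To make the seed handling concrete, here is a minimal sketch of how a seed could be read from the `SKLEARN_TEST_RNG` environment variable named in the proposal. The helper names `seed_from_env` and `make_rng` are hypothetical and exist only for illustration; they are not part of scikit-learn or pytest. The sketch assumes the variable, when set, holds a plain integer.

```python
import os

import numpy as np

DEFAULT_SEED = 42  # default value taken from the proposal


def seed_from_env(environ=os.environ, name="SKLEARN_TEST_RNG"):
    """Return the seed stored under `name`, or DEFAULT_SEED if it is unset."""
    raw = environ.get(name)
    if raw is None:
        return DEFAULT_SEED
    # Environment values are always strings, so convert before seeding.
    return int(raw)


def make_rng(environ=os.environ):
    """Build a RandomState seeded from the (hypothetical) helper above."""
    return np.random.RandomState(seed_from_env(environ))


# Plain dicts stand in for os.environ so the example has no side effects.
a = make_rng({"SKLEARN_TEST_RNG": "7"}).uniform(size=3)
b = make_rng({"SKLEARN_TEST_RNG": "7"}).uniform(size=3)
c = make_rng({}).uniform(size=3)  # falls back to DEFAULT_SEED
```

Here `a` and `b` come from the same seed and are therefore identical, while `c` uses the default seed. Under this assumption, a run with a different seed would amount to setting the variable before invoking pytest.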
Issue Analytics
- Created 4 years ago
- Comments: 9 (9 by maintainers)
 

The idea would be to specifically identify tests that are robust to changes of random state.
In progress in https://github.com/scikit-learn/scikit-learn/pull/22749
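
As a rough illustration of what "robust to changes of random state" could mean in practice, the sketch below runs a check under several seeds and reports the seeds for which it fails. The names `failing_seeds`, `robust_check`, and `brittle_check` are hypothetical and are not taken from the linked pull request; this is one possible way to separate the two kinds of test, not the approach scikit-learn adopted.

```python
import numpy as np


def passes_for_seed(check, seed):
    """Return True if `check` raises no AssertionError for this seed."""
    rng = np.random.RandomState(seed)
    try:
        check(rng)
    except AssertionError:
        return False
    return True


def failing_seeds(check, seeds):
    """List the seeds under which `check` fails."""
    return [seed for seed in seeds if not passes_for_seed(check, seed)]


def robust_check(rng):
    # Holds for every seed: a sample mean lies between the min and the max.
    x = rng.normal(size=100)
    assert x.min() <= x.mean() <= x.max()


def brittle_check(rng):
    # Depends on the seed: a single standard normal draw is positive
    # only about half of the time.
    assert rng.normal() > 0


robust_failures = failing_seeds(robust_check, range(50))
brittle_failures = failing_seeds(brittle_check, range(50))
```

A check whose list of failing seeds is empty is invariant over the seeds tried; a non-empty list flags a test that only passes for particular seeds.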