DOC: Recommended best practices for passing around a random bit generator
If I develop a module that uses random numbers from numpy.random, and follow the recommendation to use numpy.random.default_rng() to create a generator, e.g. in the module’s initialization code, then every time the module is freshly imported I get a new sequence of random numbers.
Now it is often useful, for example for running exactly the same simulation several times, to initialize the generator to a specific state, and this creation function provides that possibility. I can do that for testing purposes during development, but in the final version I would of course use the form in which “fresh entropy” is used.
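As a rough illustration of the two forms mentioned here, the sketch below compares seeded and unseeded generators. It assumes only that numpy is installed; the seed 12345 and the variable names are arbitrary choices for the example.

```python
import numpy as np

# Seeded form: two generators built from the same seed yield the same stream.
rng_a = np.random.default_rng(12345)
rng_b = np.random.default_rng(12345)
same = np.array_equal(rng_a.random(5), rng_b.random(5))

# "Fresh entropy" form: each call draws new entropy from the operating
# system, so two generators will, for all practical purposes, differ.
rng_c = np.random.default_rng()
rng_d = np.random.default_rng()
differ = not np.array_equal(rng_c.random(5), rng_d.random(5))
```

Here same should come out True and differ should, with overwhelming probability, come out True as well.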
Now, if someone else were to use my module, and they want exactly the same behavior on multiple runs, e.g. for testing their code which builds on mine, they would not have the possibility to enforce that for my module short of editing my code. I think it would be better to offer such a user an API, such that in normal use a “fresh entropy” generator is created, but it is also possible to set a specific one. I believe that is what “passing around a generator” refers to in NEP 19.
Finally, my question: Is there a recommended way to provide such an API? One possibility would be to have a consistently named module-global variable (e.g. __rng__), which could be read to use the same generator elsewhere, or set to force the module to use another one.
I understand the concerns that led to avoiding a “default global instance”. But the drawback is what I described above: for a program consisting of different pieces from different authors, all using numpy.random, it is not easy to enforce a specific RNG seed.
Issue Analytics
- State:
- Created 4 years ago
- Reactions: 2
- Comments: 38 (30 by maintainers)

When I refer to “passing around a generator”, I mean that it should be an explicit argument in all of the functions that use the generator. If your function calls another function that uses the generator, it should pass that along in the arguments. There should not be any globals, module-level or otherwise. This is the way that code from multiple authors works together.
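A minimal sketch of what passing the generator along might look like. The function names sample_noise and simulate, and the seed 2024, are hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_noise(n, rng):
    """Inner helper (hypothetical): draws from the generator it is handed."""
    return rng.standard_normal(n)

def simulate(n_steps, rng):
    """Outer function (hypothetical): forwards the same generator onward."""
    steps = sample_noise(n_steps, rng)
    return np.cumsum(steps)

# The caller owns the generator and decides how it is seeded.
rng = np.random.default_rng(2024)
path = simulate(10, rng)
```

Because no function reaches for a global, the caller alone controls reproducibility: handing in a generator built from the same seed reproduces the same path.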
scikit-learn and scipy do this very effectively using a check_random_state() utility function (for the old RandomState system). You can use default_rng() more or less the same way.
Suppose f(x, y, rng=None) passes its rng argument through default_rng(). If I call f(x, y), it will use a Generator with fresh entropy. If I call f(x, y, rng=1234567890), it will use a Generator seeded with 1234567890. If I call f(x, y, rng=some_generator_i_have_from_somewhere), it will use that Generator instance.
We should also note that RandomState is the older API, and new code should be using rng = Generator(PCG64(123456789)).
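The f(x, y, rng=...) calling convention described above can be sketched as follows. The body of f is a placeholder invented for this example; the part that matters is the single default_rng(rng) call, which accepts None, an integer seed, or an existing Generator (returned unchanged).

```python
import numpy as np

def f(x, y, rng=None):
    # None -> fresh entropy; int -> seeded Generator;
    # existing Generator -> passed through unaltered.
    rng = np.random.default_rng(rng)
    # Placeholder computation (an assumption for this sketch).
    return x + y * rng.random()

r1 = f(1.0, 2.0)                  # fresh entropy, differs run to run
r2 = f(1.0, 2.0, rng=1234567890)  # seeded
r3 = f(1.0, 2.0, rng=1234567890)  # same seed, same result as r2

# An explicitly constructed Generator can be handed in as well;
# calling f advances its state.
gen = np.random.Generator(np.random.PCG64(123456789))
r4 = f(1.0, 2.0, rng=gen)
```

One consequence of this design is that downstream users can force reproducibility for testing without editing the module, while normal use still gets fresh entropy by default.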