DOC: Recommended best practices for passing around a random bit generator

If I develop a module that uses random numbers from numpy.random, and follow the recommendation to use numpy.random.default_rng() to create a generator, e.g. in the module’s initialization code, then every time the module is freshly imported I get a new sequence of random numbers.

Now it is often useful, for example for running exactly the same simulation several times, to initialize the generator to a specific state, and this creation function provides that possibility. I can do that for testing purposes during development, but in the final version I would of course use the form in which “fresh entropy” is used.

Now, if someone else were to use my module, and they want exactly the same behavior on multiple runs, e.g. for testing their code which builds on mine, they would not have the possibility to enforce that for my module short of editing my code. I think it would be better to offer such a user an API, such that in normal use a “fresh entropy” generator is created, but it is also possible to set a specific one. I believe that is what “passing around a generator” refers to in NEP 19.

Finally, my question: is there a recommended way to provide such an API? One possibility would be to have a consistently named module-global variable (e.g. __rng__), which could be read to use the same generator elsewhere, or set to force the module to use another one.

I understand the concerns that led to avoiding a “default global instance”. But the drawback is what I described above: for a program consisting of different pieces from different authors, all using numpy.random, it is not easy to enforce a specific RNG seed.

Issue Analytics

  • State: open
  • Created 4 years ago
  • Reactions: 2
  • Comments: 38 (30 by maintainers)

Top GitHub Comments

5 reactions
rkern commented, Feb 15, 2020

When I refer to “passing around a generator”, I mean that it should be an explicit argument in all of the functions that use the generator. If your function calls another function that uses the generator, it should pass that along in the arguments. There should not be any globals, module-level or otherwise. This is the way that code from multiple authors works together.

scikit-learn and scipy do this very effectively using a check_random_state() utility function (for the old RandomState system). You can use default_rng() more or less the same way.

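import numpy as np

def f(x, y, rng=None):
    # rng may be None (fresh entropy), an integer seed, or an existing Generator.
    rng = np.random.default_rng(rng)
    z = rng.normal(x, y)
    # Pass the Generator along explicitly instead of relying on a global.
    return g(z, rng)

def g(z, rng=None):
    rng = np.random.default_rng(rng)
    return z + rng.uniform(0, 10)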

If I call f(x, y), it will use a Generator with fresh entropy. If I call f(x, y, rng=1234567890), it will use a Generator seeded with 1234567890. If I call f(x, y, rng=some_generator_i_have_from_somewhere), it will use that Generator instance.
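
The three call styles described above can be checked with a short script. This is only a sketch: f and g are repeated from the comment so the block runs on its own, and the input values, the seed 42, and the variable names are arbitrary choices for illustration.

```python
import numpy as np

def g(z, rng=None):
    rng = np.random.default_rng(rng)
    return z + rng.uniform(0, 10)

def f(x, y, rng=None):
    rng = np.random.default_rng(rng)
    z = rng.normal(x, y)
    return g(z, rng)

# Two calls with the same integer seed produce the same result.
a = f(0.0, 1.0, rng=1234567890)
b = f(0.0, 1.0, rng=1234567890)
same_seed_matches = (a == b)

# default_rng() hands back an existing Generator unaltered, which is why
# f and g end up drawing from the same stream when one is passed in.
shared = np.random.default_rng(42)
passthrough = np.random.default_rng(shared) is shared

# Reusing one Generator advances its state between calls.
c = f(0.0, 1.0, rng=shared)
d = f(0.0, 1.0, rng=shared)
```

A caller who wants a reproducible run therefore only needs to create one seeded Generator at the top level and pass it down through every function that draws random numbers.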

1 reaction
mattip commented, Jun 26, 2021

We should also note that RandomState is the older API, and new code should be using rng = Generator(PCG64(123456789)).
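
Spelled out with imports, the explicit construction mentioned above might look like the following sketch. The seed value and variable names are arbitrary; the only claim illustrated is that two Generators built from the same PCG64 seed yield the same draws.

```python
from numpy.random import Generator, PCG64

seed = 123456789  # arbitrary seed, for illustration only

# Two independent Generator objects built from the same seeded bit generator.
rng_a = Generator(PCG64(seed))
rng_b = Generator(PCG64(seed))

draws_a = rng_a.standard_normal(3)
draws_b = rng_b.standard_normal(3)
```

Naming the bit generator explicitly pins the algorithm, whereas default_rng() uses whatever NumPy's default bit generator is (PCG64 at the time of writing).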
