DOC: better document the spawn interface, compare and contrast it to Jax's "split"


Jax, a new deep learning library, has copied numpy’s excellent interface. They added one innovation to random number generation: splitting.

This is useful when parallel processes that should not be serialized need access to random numbers from a reproducible stream.

I suggest adding a method to BitGenerator:

b: BitGenerator
b.split(n)  # returns a list of n BitGenerators, each differently initialized from b in a unique, reproducible way

and a similar method on Generator that splits the underlying bit-generator and returns

[type(self)(bit_generators[i]) for i in range(n)]

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 9 (8 by maintainers)

Top GitHub Comments

2 reactions
rkern commented, Feb 27, 2020

While the current API forces you to work with SeedSequences directly to access its spawn() method, we did make sure that it would be possible to expose spawn() methods on BitGenerator and Generator, which will be more convenient. We wanted to wait until we got a little more experience with the concept before exposing it so prominently.
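As a minimal sketch of what working with SeedSequence directly looks like today, the following builds a list of independent Generators from one root seed. The helper name `split` is hypothetical (it is not a NumPy API), and the seed value is arbitrary.

```python
import numpy as np


def split(seed_seq, n):
    """Hypothetical helper: return n Generators seeded from spawned children."""
    # SeedSequence.spawn(n) returns n child SeedSequences, each derived
    # from the parent in a unique, reproducible way.
    return [np.random.Generator(np.random.PCG64(child))
            for child in seed_seq.spawn(n)]


root = np.random.SeedSequence(20200227)  # arbitrary illustrative seed
workers = split(root, 4)

# Each worker draws from its own stream.
draws = [g.integers(0, 2**32, size=3) for g in workers]
```

Rebuilding the root from the same seed and splitting again should give the same per-worker streams, which is what makes the scheme usable across parallel processes.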

Just to provide some mathematical background, Jax’s PRNG is in the same weak-crypto family as our Philox BitGenerator. The method by which it splits only works well for that family of weak-crypto PRNGs because that family keeps its initial seed around as the key value and only evolves a counter as one draws numbers from it. The other PRNGs iterate the state.
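To illustrate the key-plus-counter structure described above, here is a small sketch using NumPy's Philox BitGenerator. The key value is arbitrary, and this only inspects Philox's state; it is not how Jax implements splitting.

```python
import numpy as np

key = 0xDEADBEEF  # arbitrary illustrative key

# Same key and same starting counter reproduce the same stream.
a = np.random.Generator(np.random.Philox(key=key, counter=0))
b = np.random.Generator(np.random.Philox(key=key, counter=0))
same_stream = np.array_equal(a.random(5), b.random(5))

# Distinct keys give distinct streams.
streams = [np.random.Generator(np.random.Philox(key=key + i)) for i in range(3)]
first_draws = [s.random(5) for s in streams]

# Drawing numbers evolves only the counter; the key is kept as-is.
bg = np.random.Philox(key=key, counter=0)
before = bg.state["state"]
np.random.Generator(bg).random(5)
after = bg.state["state"]
key_unchanged = np.array_equal(before["key"], after["key"])
counter_changed = not np.array_equal(before["counter"], after["counter"])
```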

SeedSequence implements a similar scheme, but separates that scheme out from the PRNG implementation. By keeping the original seed state around as SeedSequence and by implementing good integer hashing techniques, we can get the same benefits without needing the full weak-crypto functionality of the PRNG. This may be of particular interest to the Jax developers since the ThreeFry algorithm that they use is not the fastest, nor necessarily the most GPU-friendly (once the massive-parallelization playing field is leveled by SeedSequence). They might be able to use a faster PRNG like SFC64 (our default of PCG64 may not be appropriate on a GPU because it requires 128-bit multiplication).
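A short sketch of that separation, assuming an arbitrary entropy value: a spawned child is identified by the original entropy plus a `spawn_key`, so it can be rebuilt without the parent object and fed to any BitGenerator, here SFC64.

```python
import numpy as np

entropy = 12345  # arbitrary illustrative seed
parent = np.random.SeedSequence(entropy)
children = parent.spawn(3)

# The second child records its position in its spawn_key.
child_key = children[1].spawn_key

# Rebuild the same child from the entropy and spawn_key alone.
rebuilt = np.random.SeedSequence(entropy, spawn_key=child_key)

# The seeding scheme is independent of the PRNG, so any BitGenerator works.
g1 = np.random.Generator(np.random.SFC64(children[1]))
g2 = np.random.Generator(np.random.SFC64(rebuilt))
x1, x2 = g1.random(4), g2.random(4)
```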

1 reaction
peteroupc commented, Feb 27, 2020

Note that Philox, the Threefry PRNG, and SFC are merely examples of a general construction called the “counter-based PRNG” (as used in the “Random123” paper). In general, a counter-based PRNG uses an underlying hash function or block cipher to hash a seed and an incrementing counter. Any other hash function can substitute for Threefry (or whatever underlying function the counter-based PRNG uses) as long as the resulting PRNG provides adequate randomness.
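The general construction can be sketched in a few lines. This is a toy illustration only: the class name `ToyCounterPRNG` is hypothetical, and SHA-256 is swapped in for the underlying function purely for convenience. It is not Threefry or Philox and makes no claim about speed or statistical quality.

```python
import hashlib
import struct


class ToyCounterPRNG:
    """Toy counter-based PRNG: output_i = H(key || i). Illustration only."""

    def __init__(self, key: bytes, counter: int = 0):
        self.key = key          # fixed for the lifetime of the stream
        self.counter = counter  # the only part of the state that evolves

    def next_uint64(self) -> int:
        data = self.key + struct.pack("<Q", self.counter)
        self.counter += 1
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "little")

    def split(self, n: int):
        # Derive each child key by hashing the parent key with a child index.
        return [
            ToyCounterPRNG(
                hashlib.sha256(b"split" + self.key + struct.pack("<Q", i)).digest()
            )
            for i in range(n)
        ]


root_key = hashlib.sha256(b"illustrative seed").digest()
parent = ToyCounterPRNG(root_key)
left, right = parent.split(2)
outputs = [parent.next_uint64() for _ in range(3)]
```

Because the key never changes, any position in a stream can be reached by setting the counter directly, and child streams are reproducible from the parent key and their index.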

Also, splittable PRNGs are far from being an “innovation” of JAX (see also the JAX PRNG design notes); they have existed, for example, in Haskell and Java for years. Some of the known constructions for splittable PRNGs are surveyed in “Evaluation of Splittable Pseudo-Random Generators” by H. G. Schaathun, 2015. Some of them are general enough to be used by any PRNG (including Mersenne Twister and PCG), but do not necessarily lead to high-quality splittable PRNGs. Another example of a splittable PRNG is found in JuliaLang/julia#34852. See also https://github.com/idontgetoutmuch/random/issues/7.
