
Mac/Linux with multiprocessing, all workers are seeded the same random state

See original GitHub issue

Reproducing code example:

Full code is here; I will leave this branch untouched so you can see the behavior I’m talking about:

https://github.com/FlorinAndrei/nsphere/tree/numpy-mp

On Mac or Linux, edit xpu_workers.py and comment out the rseed lines, and the bug will be triggered.

You can tell the bug has been triggered because there are very few dots in the Monte Carlo simulation graph in the Jupyter notebook. There should be 100 dots there, but due to the bug there are far fewer (identical samples plot on top of each other) - and the whole population is far less random, which affects the app as a whole.


What’s really going on:

I create a pool of workers with:

from multiprocessing import Pool

# One worker per process; every worker receives the same argument tuple.
p = Pool(processes=num_p)
arglist = [(points, d, num_p, sysmem, gpumem, pointloops)] * num_p
work_out = p.map(make_dots, arglist)

And within the worker I have something like this:

pts = np.random.random_sample((points, d))  - 0.5  # (points, d) array of uniform draws in [-0.5, 0.5)

Parts of the pts array are returned as samples from all workers to the master process and are collated in the work_out matrix. Each worker is supposed to draw its own random sample - and of course the expectation is that each sample is different. https://dilbert.com/strip/2001-10-25

On Windows this works great.

On Mac and Linux, all pts arrays are generated with the exact same “random” content. The samples from workers are all identical. Within each sample the content looks random enough (just an eyeball estimate) but all samples coincide perfectly with each other.

It’s a very frustrating bug: the cause is hard to figure out, and it makes the code misbehave in weird ways.
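
The likely mechanism (my reading, not stated in the original report): on Mac and Linux, multiprocessing starts workers with fork() by default on Python 3.7, so every worker inherits an identical copy of the parent’s global NumPy RNG state; on Windows, workers are spawned as fresh interpreters, and NumPy reseeds from OS entropy in each one. A minimal, self-contained sketch of the failure mode - the names here are illustrative, not taken from the nsphere code:

import os
import time
import numpy as np
from multiprocessing import Pool

def sample(_):
    time.sleep(0.1)  # spread the tasks across all workers
    # Each worker draws from the global RNG state it inherited via fork()
    return os.getpid(), np.random.random_sample(3)

if __name__ == "__main__":
    with Pool(processes=4) as p:
        results = p.map(sample, range(4))
    for pid, row in results:
        print(pid, row)  # with fork, rows from different pids typically coincide

On Windows (spawn) the rows differ; under fork they coincide, matching the behavior described above.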

I have to do this in each worker to get rid of the bug:

import random

rseed = random.randint(0, 2**32 - 1)  # randint is inclusive on both ends;
                                      # np.random.seed() rejects seeds >= 2**32
xp.random.seed(rseed)                 # xp: the array module used by the worker
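
A variation on the same workaround (my suggestion, not what the nsphere branch does): reseed once per worker process through Pool’s initializer hook instead of inside every task. Calling np.random.seed() with no argument pulls fresh entropy from the OS; num_p is reused from the snippet above.

import numpy as np
from multiprocessing import Pool

def reseed():
    # Runs once in each worker process; seed=None draws fresh OS entropy,
    # so the fork-inherited state diverges immediately.
    np.random.seed()

p = Pool(processes=num_p, initializer=reseed)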

Numpy/Python version information:

NumPy 1.16.4
Python 3.7.4 (default, Jul  9 2019, 18:13:23) [Clang 10.0.1 (clang-1001.0.46.4)]

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
mattip commented, Oct 16, 2019

Without diving too deeply into your code, I wonder if you have seen the new (as of 1.17) random.BitGenerator API? In particular, you might be interested in the work done to ensure parallel processes get “independent” streams. Please let us know if we could improve the documentation to make it clearer, and whether it helps solve your problem.
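
For reference, the parallel-streams approach mattip points at looks roughly like this on NumPy >= 1.17 (a sketch based on the numpy.random documentation, not code from this issue; points, d and num_p are stand-in values for the report’s parameters): spawn one child SeedSequence per worker and build an independent Generator from each.

import numpy as np
from numpy.random import SeedSequence, default_rng
from multiprocessing import Pool

points, d, num_p = 100, 3, 4  # stand-ins for the report's parameters

def make_dots(child_seed):
    rng = default_rng(child_seed)          # independent stream per worker
    return rng.random((points, d)) - 0.5   # same draw as the original worker

if __name__ == "__main__":
    ss = SeedSequence()                    # seeded from OS entropy
    child_seeds = ss.spawn(num_p)          # statistically independent children
    with Pool(processes=num_p) as p:
        work_out = p.map(make_dots, child_seeds)

Because each worker receives its own SeedSequence child, the streams stay independent regardless of how the processes are started (fork or spawn).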

0 reactions
mattip commented, Nov 4, 2019

Closing. Thanks for the update. Hopefully you will try the new API.


Top Results From Across the Web

python - Same output in different workers in multiprocessing
I think you'll need to re-seed the random number generator using numpy.random.seed in your do_calculation function.

Random seed is replication across child processes #9650
When spawning child processes using the multiprocessing module, it appears that all child processes share the parent's random seed.

multiprocessing and seeded RNGs - boris babenko, phd
one thing i like to do whenever i use a random number generator is to explicitly set the seed. this ensures that my...

Random state within joblib.Parallel - Read the Docs
Technically, the reason is that all forked Python processes share the same exact random seed. As a result, we obtain twice the same...

Why does "numpy.random.rand" produce the same values in ...
uniform can work just ok. Here is the code. import multiprocessing import numpy as np import random def print_map(_): print(np.random.rand( ...
