__RandomState_ctor accesses /dev/urandom, a significant bottleneck when copying RandomStates

The __RandomState_ctor function in numpy.random.__init__ constructs a new RandomState object without an explicit seed. The unseeded call results in an access to /dev/urandom, which is very expensive. The purpose of this function is to aid in pickling: the internal state of the new object is immediately overwritten with the state of the pickled RandomState, so this expensive access accomplishes nothing. Because the state of the newly created RandomState is irrelevant (it will be replaced during unpickling), it is safe to use the fast fixed seed value of 0. An example of an updated function is below, along with a nose-backed unit test that verifies the behavior. To incorporate this into the master branch, the RandomState() call in numpy/random/__init__.py simply needs seed=0 added to it.

import numpy
from numpy.random import RandomState


def __Fast_RandomState_ctor():
    """Patch the __RandomState_ctor initialization routine for performance

    numpy.random.__init__.__RandomState_ctor calls RandomState() directly, which
    interacts with the filesystem and kernel via /dev/urandom, causing a major performance
    bottleneck running many short parallel jobs.

    This function overrides that code and uses the seed=0 version to avoid the bad
    system call interaction.  As far as I can tell this is totally kosher, as this
    routine appears to be used for pickling only, and produces a random state object
    that will be initialized in some other way later.

    """
    return RandomState(seed=0)

numpy.random.__RandomState_ctor = __Fast_RandomState_ctor
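The argument that the placeholder seed does not matter can be checked directly. The following is a minimal sketch using only the public get_state/set_state methods; the helper name clone_via_state is hypothetical and is not part of NumPy. It assumes, as the issue argues, that restoring a RandomState amounts to building a throwaway object and then overwriting its state.

```python
import numpy as np


def clone_via_state(source):
    """Hypothetical helper: copy a RandomState through a seed=0 placeholder."""
    # The placeholder is seeded with a fixed value, so no OS entropy is requested.
    clone = np.random.RandomState(seed=0)
    # set_state replaces the placeholder's internal state with the source's state.
    clone.set_state(source.get_state())
    return clone


original = np.random.RandomState(seed=12345)
original.rand(10)  # advance the generator so it is no longer freshly seeded

clone = clone_via_state(original)

# Both generators should now produce the same stream.
assert original.rand() == clone.rand()
```

If the assertion holds, the value used to seed the placeholder had no influence on the clone's output, which is the property the patch relies on.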

Unit test to verify the behavior:

import copy

import nose.tools
import numpy as np


def test_copy_random_state():
    """Test that we can deep copy a random state properly

    Checks that our constructor optimization is safe
    """
    def state_eq(ran1, ran2):
        """Need to know too much about structure of RandomState.get_state()"""
        name1, state1 = ran1.get_state()[0:2]
        name2, state2 = ran2.get_state()[0:2]
        nose.tools.eq_(name1, name2)
        nose.tools.ok_((state1 == state2).all())

    def checkme(seed):
        """Ensure that RandomState created with seed can be duplicated properly"""
        if seed == 'no_arg':
            state = np.random.RandomState()
        else:
            state = np.random.RandomState(seed=seed)
        dup1 = copy.deepcopy(state)     # dup1 is the same as state
        state_eq(state, dup1)
        state_rands = (state.rand(), state.rand(), state.rand())
        dup1_rand1 = dup1.rand()        # dup1 is two behind state
        dup2 = copy.deepcopy(dup1)      # dup2 is == dup1, and two behind state
        state_eq(dup1, dup2)

        dup1_rands = (dup1_rand1, dup1.rand(), dup1.rand())
        dup2_rands = (dup1_rand1, dup2.rand(), dup2.rand())
        nose.tools.eq_(state_rands, dup1_rands, 'Original and immediate duplicate should produce the same random numbers')
        nose.tools.eq_(state_rands, dup2_rands, 'dup2 should continue the sequence from where dup1 was copied')

        # ensure that we're really not synchronized
        state.rand()
        for _ in range(100):
            nose.tools.assert_not_equal(state.rand(), dup1.rand())
            nose.tools.assert_not_equal(state.rand(), dup2.rand())

    nose.tools.eq_(np.random.__RandomState_ctor, __Fast_RandomState_ctor)

    yield checkme, 0
    yield checkme, None
    yield checkme, 'no_arg'
    yield checkme, [1, 2, 3]
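The test above goes through copy.deepcopy, which uses the same reduction protocol as pickle. As a rough sketch of that path (assuming only the standard pickle module and the object's own __reduce__ method; the helper name roundtrip is hypothetical), the following inspects what the object hands to pickle and confirms that a round trip preserves the stream. The exact constructor and state layout returned by __reduce__ differ between NumPy versions, so the sketch only relies on the general shape: a callable, its arguments, and a state.

```python
import pickle

import numpy as np


def roundtrip(rng):
    """Hypothetical helper: serialize and restore a RandomState with pickle."""
    return pickle.loads(pickle.dumps(rng))


rng = np.random.RandomState(seed=42)
rng.rand(5)  # advance the generator so its state is not the freshly seeded one

# __reduce__ tells pickle how to rebuild the object: a constructor callable,
# its arguments, and the state that is applied to the new object afterwards.
reduced = rng.__reduce__()
constructor = reduced[0]

restored = roundtrip(rng)

# The restored generator should continue exactly where the original is now.
same_next_draws = np.array_equal(rng.rand(3), restored.rand(3))
```

Because the state is applied after the constructor runs, whatever the constructor used to initialize the object is discarded, which is why the choice of seed in the constructor should not affect the result.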

Issue Analytics

  • State: closed
  • Created 9 years ago
  • Comments: 14 (11 by maintainers)

Top GitHub Comments

depristo commented, May 31, 2014

It is on virtualized machines on AWS, especially when running on multiple cores via multiprocessing. Each access is much more expensive there, and many processes contend for the same system resource. I can confirm the effect is much less significant directly on my MacBook, but on a c3.8xlarge on AWS we observe that 90% of our CPU time is spent blocking on /dev/urandom syscall requests.

On Sat, May 31, 2014 at 12:13 PM, Julian Taylor notifications@github.com wrote:

urandom is the nonblocking random pool which is quite fast:

In [8]: %timeit copy.deepcopy(d)
1000 loops, best of 3: 603 µs per loop

is this really a bottleneck?

— Reply to this email directly or view it on GitHub https://github.com/numpy/numpy/issues/4763#issuecomment-44752225.

Mark A. DePristo mark@depristo.com
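Since the two comments above disagree about the cost, one way to settle it on a given machine is to measure. This is only an illustrative sketch using the standard timeit module; the function name time_constructions is hypothetical, and the numbers it reports depend heavily on the platform, the NumPy version, and how many processes are competing for the entropy source, so no expected output is given.

```python
import copy
import timeit

import numpy as np


def time_constructions(repeats=200):
    """Hypothetical helper: average seconds per call for three operations."""
    template = np.random.RandomState(seed=0)
    totals = {
        # Unseeded construction asks the operating system for entropy.
        "unseeded": timeit.timeit(lambda: np.random.RandomState(), number=repeats),
        # Fixed-seed construction does not need OS entropy.
        "seed=0": timeit.timeit(lambda: np.random.RandomState(seed=0), number=repeats),
        # deepcopy goes through the pickling constructor discussed in this issue.
        "deepcopy": timeit.timeit(lambda: copy.deepcopy(template), number=repeats),
    }
    return {name: total / repeats for name, total in totals.items()}


if __name__ == "__main__":
    for name, seconds in time_constructions().items():
        print(f"{name:>9}: {seconds * 1e6:8.1f} µs per call")
```

Running this in a single process on a laptop and again inside many parallel workers on a virtualized host would show whether the contention described above applies to a particular setup.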

depristo commented, Jun 2, 2014

Here you go:

https://github.com/numpy/numpy/pull/4768

Mark

On Mon, Jun 2, 2014 at 1:26 PM, Charles Harris notifications@github.com wrote:

A pep8 compliant pull request would be welcome.

