Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

BUG: Large overhead in some random functions

See original GitHub issue

This issue is extracted from Gitter (by @andyfaff – I only saw it by chance, so I am creating this before we forget).

Some random generator functions such as random() have a huge overhead compared to their legacy RandomState versions:

import numpy as np

rs = np.random.RandomState()
rg = np.random.default_rng()

%timeit rs.random()
# 432 ns ± 9.18 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit rg.random()
# 5.19 µs ± 61.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The reason for this is the support for the dtype= keyword argument. A secondary reason (and maybe a second speed issue) is that np.dtype.name is a very slow operation (it could plausibly be cached).
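
As a rough illustration (a minimal sketch, not the actual NumPy internals), the string comparison through np.dtype(dtype).name can be timed against a plain identity check on the default dtype:

import timeit
import numpy as np

# Slow path described above: normalise the dtype and compare its .name string.
slow = timeit.timeit("np.dtype(key).name == 'float64'",
                     globals={"np": np, "key": np.float64}, number=100_000)

# Fast-path style check: plain object identity on the default argument.
fast = timeit.timeit("key is np.float64",
                     globals={"np": np, "key": np.float64}, number=100_000)

print(f"name lookup: {slow:.4f} s   identity check: {fast:.4f} s")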

However, the solution here will be to simply avoid the whole np.dtype(dtype).name construct as much as possible, and maybe to add a fast path for the case when dtype is not passed. Checking np.dtype(dtype).type is np.float64 may be a solution, or dtype is np.float64 (or np.dtype(dtype) is np.dtype(np.float64)) to speed up the default branch.
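
A minimal sketch of what such a fast path could look like (the helper name and dtype coverage are hypothetical, not the implementation that landed in NumPy):

import numpy as np

def _resolve_double_dtype(dtype=np.float64):
    # Fast path: the caller left the default in place, so skip dtype normalisation entirely.
    if dtype is np.float64:
        return np.float64
    # Slow path: normalise whatever was passed, then use identity checks on .type
    # instead of comparing .name strings.
    dt = np.dtype(dtype)
    if dt.type is np.float64:
        return np.float64
    if dt.type is np.float32:
        return np.float32
    raise TypeError(f"Unsupported dtype {dt!r} for random()")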


Checking that we have benchmarks for this – or adding simple small-array ones in a first commit – would be good when this is fixed.
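
A small asv-style benchmark along these lines could cover the scalar and small-array cases (the class and method names here are only a sketch, not the benchmarks that ended up in the repository):

import numpy as np

class ScalarRandom:
    # Times scalar and small-array draws for the new Generator and the legacy RandomState.
    def setup(self):
        self.rg = np.random.default_rng()
        self.rs = np.random.RandomState()

    def time_generator_scalar(self):
        self.rg.random()

    def time_generator_small_array(self):
        self.rg.random(100)

    def time_randomstate_scalar(self):
        self.rs.random()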

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments:6 (6 by maintainers)

Top GitHub Comments

1 reaction
eric-wieser commented, Jan 28, 2020

avoid the whole np.dtype(dtype).name construct as much as possible

Agreed, hopefully the slowness is enough to motivate people not to use it 😉

0 reactions
przemb commented, Feb 5, 2020

@mattip Thank you so much 😃 I tried to follow the comments mentioned above; in case of any potential improvements, please let me know. PR: https://github.com/numpy/numpy/pull/15511

Read more comments on GitHub.

Top Results From Across the Web

102037: CPU overhead from inlists much larger in 8.0.22
Bug #102037, CPU overhead from inlists much larger in 8.0.22 ... The "random-points" test uses a large inlist with 100 and 1000 entries …

Minimizing overhead with parallel functions in R [closed]
To limit overhead for moderately large N, it is almost always better to use mc.preschedule = TRUE (i.e. split the work in as …

Overhead (computing) - Wikipedia
In computer science, overhead is any combination of excess or indirect computation time, memory, bandwidth, or other resources that are required to perform …

GWP-ASan: Sampling heap memory error detection in-the-wild
The allocator limits itself to a fixed amount of memory to control memory overhead and samples allocation to the debug allocator to reduce …

Finding and Understanding Bugs in C Compilers - CS @ Utah
generates programs that cover a large subset of C while avoiding the ... generating the types of parameters to a new function). Random …
