[BUG] - numpy overflow encountered in reduce

Thanks for sharing this package, I’m loving it!

I did run into a bug today. When I try to run dist_plot on my dataset, I get the following message:

<snip>\numpy\core_methods.py:160: RuntimeWarning: overflow encountered in reduce

I isolated it down to one particular series in my dataframe. It’s not one I really care about, but maybe someone else will run into it for a series they DO care about. Here’s a describe() after running it through klib’s data_cleaning function:

df.created_at.describe()
count    5.213400e+04
mean     1.610795e+12
std      4.225043e+08
min      1.609891e+12
25%      1.610552e+12
50%      1.610838e+12
75%      1.611198e+12
max      1.611274e+12
Name: created_at, dtype: float64

Meanwhile, info() reports something different:

df.info()
…
 2   created_at   52134 non-null   float32
…

Notice one reports float32 while the other says float64… Seems fishy.
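The mismatch may be less fishy than it looks: the dtype printed at the bottom of describe() belongs to the summary Series that describe() returns, not to the column being described. A minimal sketch with a hypothetical three-value stand-in for created_at (not the actual dataset) shows both dtypes side by side:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cleaned column: float32 values of similar magnitude
created_at = pd.Series(
    np.array([1.609891e12, 1.610838e12, 1.611274e12], dtype=np.float32),
    name="created_at",
)

summary = created_at.describe()

print(created_at.dtype)  # dtype of the column itself
print(summary.dtype)     # dtype of the summary Series returned by describe()
```

So info() is the one to trust for the column's storage type; in this sketch the column stays float32 while the summary statistics come back as float64.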

I’m using miniconda on Windows 10.
conda v4.9.2
numpy v1.19.5
klib v0.1.0

If you need me to provide my dataset, I can do so.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

Zalfrin commented, Mar 10, 2021

@akanz1 yep that makes sense. You’re right, the analysis of this particular variable is not interesting, so the overflow doesn’t bother me. But who knows, maybe some day someone will run into this with data that IS interesting. 😀 Thanks for taking a look.

akanz1 commented, Mar 10, 2021

@Zalfrin thanks for the data. I was able to reproduce the issue and narrow it down to the computation of the kurtosis using scipy: scipy.stats.kurtosis(df_cleaned)

If you check the plot created with klib.dist_plot(df_cleaned), you should notice that the kurtosis becomes infinite. I was not able to identify exactly why the calculation of the kurtosis results in a RuntimeWarning with df_cleaned but not with the original df.

Given the overflow warning, I suspect that during the computation of the kurtosis (see the scipy source) the 32-bit float is not large enough to hold intermediate results, since the already large initial values (your UNIX timestamps) are squared.
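To make that suspicion concrete, here is a rough sketch using NumPy only. It does not reproduce scipy's internal code path; it only relies on the fact that kurtosis is built from the fourth power of deviations from the mean. The data is a hypothetical stand-in: uniformly distributed values over the min/max range from the describe() output above, with the same row count.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 52,134 values spread over the min/max range
# reported by describe() in the issue (not the reporter's actual dataset).
ts64 = rng.uniform(1.609891e12, 1.611274e12, size=52_134)
ts32 = ts64.astype(np.float32)


def sum_fourth_powers(values):
    """Sum of fourth powers of deviations from the mean, in the input's dtype."""
    deviations = values - values.mean()
    return np.sum(deviations**4)


with np.errstate(over="ignore"):  # silence the overflow RuntimeWarning for this demo
    total32 = sum_fourth_powers(ts32)
total64 = sum_fourth_powers(ts64)

print(total32.dtype, np.isfinite(total32))  # float32 accumulator
print(total64.dtype, np.isfinite(total64))  # float64 accumulator
```

The largest finite float32 is about 3.4e38. Deviations here are on the order of 1e8 to 1e9, so each fourth power is still representable, but the sum over roughly 52,000 rows is not, and the float32 total ends up infinite while the float64 total stays finite. That would be consistent with the cleaned (float32) frame triggering the warning and the original frame not doing so.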

Ideally, convert your timestamps to datetimes using datetime.fromtimestamp(timestamp) to avoid the overflow. Or simply ignore the warning, since the kurtosis likely does not add much value in this situation.
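A possible way to do the conversion on a whole column is pandas' to_datetime rather than calling datetime.fromtimestamp per value. This sketch assumes the values are milliseconds since the UNIX epoch, which is a guess based on their magnitude (about 1.61e12), and it uses made-up integer values in a hypothetical raw Series:

```python
import pandas as pd

# Hypothetical raw values, assumed to be milliseconds since the UNIX epoch
raw = pd.Series([1609891200000, 1610838000000, 1611274000000], name="created_at")

# unit="ms" tells pandas how to interpret the integers
created = pd.to_datetime(raw, unit="ms")

print(created.dtype)
print(created.iloc[0])
```

If the values really are milliseconds, note that datetime.fromtimestamp expects seconds, so each value would need to be divided by 1000 first. It is also probably safer to convert from the original column than from the float32 one, because float32 cannot represent timestamps of this size exactly.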

I hope this helps!

