`numpy.load` slows `runcell`

Despite the last command printing 0.0, runcell(3, ...) took several seconds. Strangely, shape = (2, ...) with dtype='float64', which is the same size in memory, doesn’t yield this effect; I haven’t tested with 'float32'. Also, IPython commands aren’t slowed.

Using Spyder 4.2.1, as conda doesn’t have 4.2.2 yet; Windows 10 x64, Python 3.7.9, numpy 1.19.2.


Code:

import numpy as np

X = np.random.randn(16, 16, 240, 24000).astype('float16')
np.save('arr.npy', X)

Restart the kernel, then run:

import numpy as np
from time import time
out = np.load('arr.npy')

#%%
t0 = time()
t1 = time()

#%% 
print(t1 - t0)

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 21 (19 by maintainers)

Top GitHub Comments

impact27 commented, Mar 15, 2021

Maybe we should add a message saying: “Computing min/max for the current variables took more than 2 seconds. Do you want to disable automatic variable explorer refreshing?”

bcolsen commented, Mar 12, 2021

I can confirm that this is due to min/max in the variable explorer. X takes 30 seconds to compute on my box and X.max() takes about 13 seconds. It seems that the variable explorer recomputes the min/max of every variable it shows when it gets the call to refresh after runcell or runfile. If the variable explorer is busy when runcell or runfile is

@OverLordGoldDragon @sawtw thanks for helping get to the bottom of this! Here is a workaround: open the variable explorer, right-click on the table, and uncheck “show arrays min/max”. Wait a while for the explorer to refresh, and then it should be speedy again.

Debugging

Issue 1: Variable Explorer updates every variable regardless of change

With one array in the variable explorer and min/max enabled, it takes about 26 seconds to refresh the variable explorer, regardless of whether the array has changed. Just running t0 = time(), either in IPython directly or through runcell, takes ~26 s for the variable explorer to update.
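To gauge how expensive the min/max pass is on a given machine, something like the sketch below could be used. It is only an illustration: the array is 1000 times smaller than the one in the report so that it runs quickly, and the `timed` helper is a made-up name, not part of Spyder or NumPy.

```python
import numpy as np
from time import perf_counter

# Same layout as the reported array, but with the last axis cut from 24000 to 24
# (an assumption made here so the example finishes quickly).
X = np.random.randn(16, 16, 240, 24).astype('float16')

def timed(fn):
    """Call fn() and return (result, elapsed seconds). Hypothetical helper."""
    t0 = perf_counter()
    value = fn()
    return value, perf_counter() - t0

xmin, t_min = timed(X.min)
xmax, t_max = timed(X.max)
print(f"min={xmin} ({t_min:.4f} s), max={xmax} ({t_max:.4f} s)")
```

Scaling the last axis back up toward 24000 should show how the cost grows with array size; the absolute numbers will differ from machine to machine.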

I don’t know if there is much to be done here. Perhaps we check the array size and don’t give min/max past a certain size.
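A size check along those lines might look like the following sketch. The threshold and the function name are assumptions chosen for illustration; this is not how Spyder's variable explorer is actually implemented.

```python
import numpy as np

# Hypothetical cutoff; not a value taken from Spyder.
MAX_ELEMENTS_FOR_MINMAX = 1_000_000

def summarize_minmax(arr, max_elements=MAX_ELEMENTS_FOR_MINMAX):
    """Return (min, max) for small arrays, or None when the array is empty or too large."""
    if arr.size == 0 or arr.size > max_elements:
        return None
    return arr.min(), arr.max()

small = np.arange(10, dtype='float16')
large = np.zeros((2000, 1000), dtype='float16')  # 2,000,000 elements, above the cutoff

print(summarize_minmax(small))
print(summarize_minmax(large))
```

With a guard like this, the display code could show a placeholder instead of min/max whenever the function returns None.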

Issue 2: Runcell and runfile seem to wait for comms from variable explorer.

The cause of this bug is that runcell gets stuck in comms if the variable explorer is already updating. If you use runcell after the variable explorer has finished updating, runcell works just fine. It only gets stuck when you run one cell right after another: the first cell returns normally, while the next cell waits until the variable explorer update triggered by the previous cell has finished.

Does the variable explorer need to respond to runcell, or is it good enough to just assume it will update?
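The behaviour described above can be mimicked with a toy model. This is a simplified illustration, not Spyder's actual comm code: the lock, the function names, and the 0.5 s refresh time are all assumptions standing in for the real channel and the real min/max pass.

```python
import threading
import time

# Hypothetical stand-in for the channel shared by runcell and the variable explorer.
comm_channel = threading.Lock()

def refresh_variable_explorer(seconds, started):
    """Hold the channel for `seconds`, mimicking a slow min/max refresh."""
    with comm_channel:
        started.set()          # signal that the refresh now owns the channel
        time.sleep(seconds)

def run_cell(body):
    """Run `body`, then wait for the channel; return (body time, total time)."""
    t0 = time.perf_counter()
    body()
    body_time = time.perf_counter() - t0
    with comm_channel:         # blocks while a refresh is still in flight
        pass
    return body_time, time.perf_counter() - t0

# The first cell has finished; its refresh is still running in the background.
started = threading.Event()
refresh = threading.Thread(target=refresh_variable_explorer, args=(0.5, started))
refresh.start()
started.wait()

# Second cell: the body is instant, but the call returns only after the refresh.
body_time, total_time = run_cell(lambda: None)
refresh.join()
print(f"cell body: {body_time:.3f} s, runcell total: {total_time:.3f} s")
```

In this model the cell body takes almost no time while the whole call takes about as long as the refresh, which matches the report of a cell printing 0.0 yet taking several seconds to return.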
