
Repeatedly calling scipy.signal.resample uses more memory than expected

See original GitHub issue

I have code that resamples many audio files using scipy.signal.resample. The code uses much more memory than I would expect, to the point of freezing the OS. I've managed to simplify my code to a simple test case. This uses ~8 GB of RAM after about 6 iterations, but I would expect it to use much less memory than this.

I am using inputs of varying sizes, which seems to be significant. If I use an audio input of constant size there doesn’t seem to be a problem.

from scipy.signal import resample
import numpy as np


def Test():
    shps = range(11473744, 1000, -2)

    for i, shp in enumerate(shps):
        signal = np.random.normal(size=(shp, 1))
        print(i, signal.shape)

        targetSamples = signal.shape[0] * 44100 // 48000
        signal2 = resample(signal, targetSamples, axis=0)
        print(signal2.shape)
        del signal, signal2


if __name__ == "__main__":
    Test()

I might have missed something obvious, so this might not even be a bug!

  • Scipy 1.4.1
  • Numpy 1.18.4
  • Python 3.6.9

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 14 (13 by maintainers)

Top GitHub Comments

1 reaction
mreineck commented, Jun 2, 2020

I'm not absolutely sure, but if the resampling process somehow calls scipy.fft internally, the memory growth might be caused by the module's internal plan caching. Currently scipy.fft caches plans for the last 16 different transforms it did, and for 1D transforms these plans are comparable in size to the transformed arrays (for "difficult" transform sizes, even larger).

One way to test this hypothesis is to start your test with a smaller size and check whether the memory consumption still increases after 16 iterations.
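The check suggested above can be sketched as follows. This is a hedged illustration, not code from the issue: it assumes a Unix-like system (it uses resource.getrusage, whose ru_maxrss is reported in kilobytes on Linux but in bytes on macOS), and the 16-slot cache size comes from the comment above. If the plan cache is the culprit, peak RSS should plateau once more than 16 distinct transform sizes have been used and old plans start being evicted.

```python
# Sketch: resample more than 16 distinct (small) input sizes and watch
# whether peak memory keeps growing past the 16th distinct size.
import resource

import numpy as np
from scipy.signal import resample


def run_trial(n_samples):
    """Resample a random (n_samples, 1) signal from 48 kHz to 44.1 kHz.

    Returns (output length, peak RSS as reported by ru_maxrss).
    """
    signal = np.random.normal(size=(n_samples, 1))
    target = n_samples * 44100 // 48000
    out = resample(signal, target, axis=0)
    return out.shape[0], resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


if __name__ == "__main__":
    # 20 distinct sizes, i.e. more than the 16 cached plans.
    for i, n in enumerate(range(48000, 48000 - 20 * 2, -2)):
        out_len, peak_rss = run_trial(n)
        print(i, n, out_len, peak_rss)
```

The sizes here are deliberately small so the trend in peak RSS, rather than absolute numbers, is what matters.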

0 reactions
peterbell10 commented, Jul 9, 2020

Indeed, it is about a 1/3 speedup now that I test it again. In that case, caching the factors separately is probably not a good idea, since it means the Bluestein plans won't be evicted from the cache by regular-sized plans and so will stick around for longer.
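If the FFT plan cache is indeed behind the memory growth, one possible workaround (an editor's sketch, not something proposed in the thread) is scipy.signal.resample_poly, which performs the rate conversion with polyphase FIR filtering in the time domain, so no FFT plans for arbitrary transform sizes are created:

```python
# Sketch of a polyphase alternative to FFT-based resampling.
import numpy as np
from scipy.signal import resample_poly


def resample_48k_to_44k1(signal):
    """Convert a (n_samples, channels) array from 48 kHz to 44.1 kHz."""
    # 44100/48000 reduces to 147/160; resample_poly handles the reduction.
    return resample_poly(signal, up=44100, down=48000, axis=0)


if __name__ == "__main__":
    x = np.random.normal(size=(48000, 1))  # one second at 48 kHz
    y = resample_48k_to_44k1(x)
    print(y.shape)  # (44100, 1): one second at 44.1 kHz
```

Note that resample_poly uses a different anti-aliasing filter than the Fourier method, so the two functions are not bit-for-bit interchangeable.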

Read more comments on GitHub.
