Suggestion: Sliding Window Function


Using np.lib.stride_tricks.as_strided, one can very efficiently create a sliding window that segments an array as a preprocessing step for vectorized applications. For example, a moving average with a window length of 3 and a step size of 1:

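import numpy

a = numpy.arange(10)
# Both strides equal the item size, so each row starts one element after the previous one.
a_strided = numpy.lib.stride_tricks.as_strided(
    a, shape=(8, 3), strides=(a.itemsize, a.itemsize)
)
print(numpy.mean(a_strided, axis=1))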

This is very fast, but it is hard to get right because the shape and strides parameters are difficult to work out.

I suggest the implementation of a “simple sliding_window function” that does the job of figuring out those two parameters for you.

The implementation I have been using for years allows the above to be replaced by

a = numpy.arange(10)
a_strided = sliding_window(a, size=3, stepsize=1)
print(numpy.mean(a_strided, axis=1))

@teoliphant also has an implementation that would change the above to

a = numpy.arange(10)
a_strided = array_for_sliding_window(a, 3)
print(numpy.mean(a_strided, axis=1))

Both make the code much more readable.
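Neither implementation is reproduced in this issue, so the following is only a minimal sketch of how a single-axis helper could work out the shape and strides itself. The name and parameters mirror the call above, but the body, the `axis` default, and the choice to put the window on a new last axis are assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def sliding_window(a, size, stepsize=1, axis=-1):
    """Hypothetical helper: windows of `size` along `axis`, as a read-only view."""
    a = np.asarray(a)
    axis = axis % a.ndim  # normalize negative axes
    if size < 1 or stepsize < 1:
        raise ValueError("size and stepsize must be positive")
    if size > a.shape[axis]:
        raise ValueError("window is longer than the axis")
    n_windows = (a.shape[axis] - size) // stepsize + 1
    # The windowed axis now counts windows; a new last axis indexes within a window.
    shape = a.shape[:axis] + (n_windows,) + a.shape[axis + 1:] + (size,)
    strides = (
        a.strides[:axis]
        + (a.strides[axis] * stepsize,)
        + a.strides[axis + 1:]
        + (a.strides[axis],)
    )
    return as_strided(a, shape=shape, strides=strides, writeable=False)


a = np.arange(10)
means = sliding_window(a, size=3, stepsize=1).mean(axis=1)
print(means)
```

Taking the strides from `a.strides` rather than hard-coding them keeps the sketch independent of the dtype and memory layout, and `writeable=False` guards against writing through overlapping windows.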

Since this is a common use case in vectorized computing, I suggest we add a similar function to NumPy itself.

As for which implementation to follow: they assume different things, but both let you do the same thing in the end:

sliding_window

  • slides over one axis only
  • allows setting the window size and step size
  • returns an array with n+1 dimensions
  • sliding over several axes requires one call per axis (which comes for free, as no memory is reordered)
  • has a superfluous copy parameter that can be removed and replaced by appending .copy() after the call

array_for_sliding_window

  • slides over all axes simultaneously; window lengths are given as a tuple parameter
  • assumes a step size of one in all directions
  • returns an array with 2n dimensions
  • a step size other than one requires slicing the output data (unsure if this implies copying data)
  • disabling sliding over axis n requires setting wshape[n] = 1 or wshape[n] = a.shape[n]
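The all-axes behaviour listed above can be sketched in a few lines. This is not @teoliphant's code; `windows_all_axes` is a hypothetical stand-in written only to show where the doubled dimension count comes from.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def windows_all_axes(a, wshape):
    """Hypothetical stand-in: slide over every axis with a step size of one."""
    a = np.asarray(a)
    if np.isscalar(wshape):
        wshape = (wshape,) * a.ndim
    wshape = tuple(wshape)
    if len(wshape) != a.ndim:
        raise ValueError("need one window length per axis")
    if any(w < 1 or w > n for w, n in zip(wshape, a.shape)):
        raise ValueError("window lengths must fit inside the array")
    outer = tuple(n - w + 1 for n, w in zip(a.shape, wshape))
    # First n axes pick the window position, last n axes index within the window.
    return as_strided(a, shape=outer + wshape, strides=a.strides * 2, writeable=False)


a = np.arange(25).reshape(5, 5)
print(windows_all_axes(a, (3, 2)).shape)  # (3, 4, 3, 2)
print(windows_all_axes(a, (1, 3)).shape)  # (5, 3, 1, 3)
```

With a window length of 1 on an axis, that axis keeps its full length and gains a trailing axis of length 1, which is why squeezing is needed to match the single-axis result.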

This means that for a flexible step size the following are equivalent (apart from a minor bug in sliding_window):

a = numpy.arange(10)
print(sliding_window(a, size=3, stepsize=2))

a = numpy.arange(10)
print(array_for_sliding_window(a, 3)[::2, :])  # Step size 2 by keeping every 2nd row
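On the open question of whether slicing the output copies data: basic slicing of a NumPy array returns a view, so it should not. The check below uses `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) as a stand-in for array_for_sliding_window. That substitution is an assumption; the two are not the same function.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(10)
windows = sliding_window_view(a, 3)  # step size 1, shape (8, 3)
every_second = windows[::2, :]       # basic slicing keeps every 2nd window

print(every_second.shape)                 # (4, 3)
print(np.shares_memory(a, every_second))  # True: still a view on `a`
```

Only fancy indexing (for example with an index array or boolean mask) or an explicit `.copy()` would allocate new memory here.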

For sliding over one axis, the following are equivalent (after some transposing and squeezing):

a = numpy.arange(25).reshape(5, 5)
print(sliding_window(a, size=3, axis=1))

a = numpy.arange(25).reshape(5, 5)
print(array_for_sliding_window(a, (1, 3)))

And for sliding over two axes, the following are equivalent:

a = numpy.arange(25).reshape(5, 5)
print(sliding_window(sliding_window(a, size=3, axis=0), size=2, axis=1))

a = numpy.arange(25).reshape(5, 5)
print(array_for_sliding_window(a, (3, 2)))
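The two-axis equivalence can be checked without either helper. The sketch below substitutes `numpy.lib.stride_tricks.sliding_window_view` (NumPy 1.20 and later) for both styles, which is an assumption: one call per axis plays the role of sliding_window, and a single call with a tuple plays the role of array_for_sliding_window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(25).reshape(5, 5)

# One call per axis; each call appends its window axis at the end.
per_axis = sliding_window_view(sliding_window_view(a, 3, axis=0), 2, axis=1)

# All axes in a single call, with window lengths given as a tuple.
all_axes = sliding_window_view(a, (3, 2))

print(per_axis.shape, all_axes.shape)      # (3, 4, 3, 2) (3, 4, 3, 2)
print(np.array_equal(per_axis, all_axes))  # True
```

In both results, element `[i, j, k, l]` refers to `a[i + k, j + l]`, and neither call copies the data.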

This issue is meant to spark discussion about

  • Do we need such a function?
  • Which features are required?
  • Which interface should we pursue?

After the discussion, I am willing to draft a pull request with the implementation we agree on.

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Reactions: 43
  • Comments: 23 (16 by maintainers)

Top GitHub Comments

10 reactions
nils-werner commented, Jun 21, 2016

Yes, you may be able to optimize an overlapping FFT that way. Again, I am not talking about a specific use case here.

It pains me to say it but I am looking at things from the “rapid prototyping scientific code” side of things. I have yet to come across scientific code that actually does such optimizations and doesn’t just go down the easy “slice and vectorize” road.

When I am trying out some random idea I had, I am not interested in prematurely optimizing my lapped operations. I just want it to be reasonably fast and maintainable for as little cost as possible.

  • A for loop gives me neither: the code is hard to read and it will be slow.
  • Your solution gives me only one: it may be screaming fast, but it is nearly impossible for non-CS engineers, let alone students, to rapidly iterate on or maintain.
  • A sliding window implementation removes most of the for loops and off-by-one errors and at the same time may provide a reasonable speedup.

When it comes to actual production code with more maintainers, tests, and some inner loops done in C, your implementation of course makes more sense.

4 reactions
toddrjen commented, Aug 10, 2016

Considering that both scipy and matplotlib implement their spectrogram-related functions using exactly this approach rather than some more efficient approach, it seems we have already gone down this path.

I think there are two issues with using more efficient approaches. First, it requires different approaches for any calculation you might want to do. Second, it requires someone to actually implement all of these special cases. Considering that even in the supposedly obvious case with FFT no one has stepped up to do this, it doesn’t seem that this is happening.

I think the simplest approach would not be so much to create special functions for each calculation you might want, but rather create a single function that takes the window length, overlap, a possible window, and a function. It would then use strides to create the correct shape, then apply the window (if any), then apply the function. matplotlib and scipy are already doing the first two steps, so it would be easy to translate their code into a general function that could work with any function that takes a vector.
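The three steps described above (stride into segments, apply an optional window, apply a function) could be sketched as follows. This is a rough 1-D illustration, not the matplotlib or scipy code; the name `apply_windowed` and its parameters are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def apply_windowed(x, func, length, overlap=0, window=None):
    """Hypothetical helper: apply `func` to each (optionally tapered) segment of 1-D `x`."""
    x = np.asarray(x)
    if x.ndim != 1:
        raise ValueError("this sketch handles 1-D input only")
    step = length - overlap
    if length < 1 or length > x.size or step < 1:
        raise ValueError("need 1 <= length <= len(x) and overlap < length")
    n_segments = (x.size - length) // step + 1
    # Step 1: strides build the (n_segments, length) view without copying.
    segments = as_strided(
        x,
        shape=(n_segments, length),
        strides=(x.strides[0] * step, x.strides[0]),
        writeable=False,
    )
    # Step 2: multiplying by the window allocates a new array; the view stays untouched.
    if window is not None:
        segments = segments * np.asarray(window)
    # Step 3: apply the function to each segment.
    return np.array([func(segment) for segment in segments])


x = np.arange(10, dtype=float)
spectra = apply_windowed(x, np.fft.rfft, length=4, overlap=2, window=np.hanning(4))
print(spectra.shape)  # (4, 3)
```

Any function that takes a vector can be passed as `func`; one that accepts an `axis` argument could instead be applied to the whole `segments` array at once to avoid the Python-level loop.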

To be completely open, I implemented the strided approach in matplotlib, but before that it was using the same algorithm implemented using a loop, so this was an improvement over what existed before.

