Speed up trim_zeros
    import numpy as np

    a = np.hstack([
        np.zeros((100_000,)),
        np.random.uniform(size=(100_000,)),
        np.zeros((100_000,)),
    ])
    np.trim_zeros(a)
Here the call to trim_zeros takes about 50 ms.
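For anyone wanting to reproduce the measurement, here is a minimal timing sketch. The helper name `time_call` and the seeded generator are assumptions made for illustration, not part of the original report, and the absolute numbers will vary with the machine and the NumPy version.

```python
import timeit

import numpy as np

# Same shape as the example above: zeros, random data, zeros.
rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
a = np.hstack([
    np.zeros(100_000),
    rng.uniform(size=100_000),
    np.zeros(100_000),
])


def time_call(func, arr, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` single calls."""
    return min(timeit.repeat(lambda: func(arr), number=1, repeat=repeat))


elapsed = time_call(np.trim_zeros, a)
print(f"np.trim_zeros: {elapsed * 1e3:.2f} ms")
```

Taking the minimum over several runs reduces noise from other processes; it says nothing about worst-case behavior.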
Looking at trim_zeros, it is implemented in the most obvious and unoptimized way imaginable: a for loop that looks at each item separately.
I think there should be a warning in the documentation about the fact that it’s entirely unoptimized and may be horrendously slow, or we should strive to improve performance.
As an implementation idea to improve performance, I prototyped a “block-wise” trim function to be used before trim_zeros:
    def fast_trim_zeros(filt, trim='fb'):
        # Strip whole blocks of zeros first, then let np.trim_zeros finish.
        filt = trim_zeros_block(filt, trim)
        return np.trim_zeros(filt, trim)

    def trim_zeros_block(filt, trim='fb', block_size=1024):
        """Trim blocks of zeros"""
        trim = trim.upper()
        first = 0
        if 'F' in trim:
            for i in range(0, len(filt), block_size):
                if np.any(filt[i:i+block_size] != 0.):
                    first = i
                    break
        last = len(filt)
        if 'B' in trim:
            # Start at len(filt) so the final element is inspected too;
            # starting at len(filt)-1 could cut off a trailing non-zero value.
            for i in range(len(filt), 0, -block_size):
                if np.any(filt[max(i-block_size, 0):i] != 0.):
                    last = i
                    break
        return filt[first:last]
A call to fast_trim_zeros takes about 2 ms, so it is roughly 25x as fast.
Issue Analytics
- State:
- Created 3 years ago
- Comments: 9 (8 by maintainers)
How about converting the passed object into a boolean array and then using np.argmax() to find the first/last non-zero element? With your previously defined example array I’m seeing an increase in execution speed of ~2 orders of magnitude (398 µs versus 37 ms).
Shall I create a pull request with the implementation as proposed above?
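The boolean-mask idea could look roughly like the sketch below. This is not the commenter's actual code, which is not shown here; the function name `trim_zeros_argmax` is hypothetical, and the sketch assumes a 1-D input and always returns an ndarray rather than preserving the input type.

```python
import numpy as np


def trim_zeros_argmax(filt, trim="fb"):
    """Trim leading/trailing zeros of a 1-D array via a boolean mask.

    Hypothetical helper for illustration; assumes 1-D input.
    """
    arr = np.asarray(filt)
    trim = trim.upper()
    if "F" not in trim and "B" not in trim:
        return arr  # nothing requested, nothing trimmed

    nonzero = arr != 0  # True wherever an element is non-zero
    if not nonzero.any():
        return arr[:0]  # all zeros: everything is trimmed away

    first, last = 0, len(arr)
    if "F" in trim:
        # argmax on a boolean array returns the index of the first True.
        first = int(nonzero.argmax())
    if "B" in trim:
        # Reversing the mask turns "last True" into "first True".
        last = len(arr) - int(nonzero[::-1].argmax())
    return arr[first:last]
```

The all-zeros check matters because np.argmax returns 0 when no element is True, which would otherwise be indistinguishable from a non-zero value at index 0.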