timeseries aggregate_downsample() appears to be significantly slower in Astropy 5
Description
TimeSeries aggregate_downsample() appears to be significantly slower in Astropy 5 than in Astropy 4.
For the sample TimeSeries below (20000 data points), running aggregate_downsample(ts, time_bin_size=10*u.minute) takes roughly:
- Astropy 4.3.1, Windows 11: < 1 sec
- Astropy 5, Windows 11: ~ 10 sec
- Astropy 5, Windows 11 with WSL2: ~ 7 sec
The Astropy versions in question were installed with conda (pip for WSL2).
Steps to Reproduce
The sample time series and the timing script:
# Test the running time of timeseries aggregate_downsample()
from astropy.time import Time
# import aggregate_downsample directly; the name "astropy" itself is never bound here
from astropy.timeseries import TimeSeries, aggregate_downsample
from astropy import units as u
import numpy as np
from timeit import default_timer as timer
# the scale of a typical TESS 2-minute cadence lightcurve
num_points = 20000
time = Time(2457000 + np.arange(0, num_points) / 24 / 60 * 2, format="jd")
flux = np.ones(num_points)
ts = TimeSeries(time=time, data=dict(flux=flux))
start = timer()
ts_b = aggregate_downsample(ts, time_bin_size=10*u.minute)
end = timer()
print("ts.bin(10 min) elapsed time:", (end - start))
# number of points before and after downsampling
print(len(ts), len(ts_b))
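The script above reports only the total elapsed time. To see which calls dominate, the same call could be run under the standard-library profiler. The following is a minimal sketch, not part of the original report: profile_call and busy_sum are hypothetical names, and busy_sum is only a stand-in workload so the example runs without Astropy installed.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report text).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Stand-in workload (hypothetical) so the sketch is self-contained.
def busy_sum(n):
    return sum(i * i for i in range(n))


result, report = profile_call(busy_sum, 100_000)
print(report)
```

With the timing script's variables in scope, the equivalent call would presumably be profile_call(aggregate_downsample, ts, time_bin_size=10*u.minute), and the report should then show whether the time is spent inside np.searchsorted().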
System Details
Windows 11 + Astropy 4.3.1
Windows-10-10.0.22000-SP0
Python 3.9.10 | packaged by conda-forge | (main, Feb 1 2022, 21:21:54) [MSC v.1929 64 bit (AMD64)]
astropy 4.3.1
Numpy 1.22.3
pyerfa 2.0.0.1
Scipy 1.8.0
Windows 11 + Astropy 5
Windows-10-10.0.22000-SP0
Python 3.9.10 | packaged by conda-forge | (main, Feb 1 2022, 21:21:54) [MSC v.1929 64 bit (AMD64)]
astropy 5.0.4
Numpy 1.22.3
pyerfa 2.0.0.1
Scipy 1.8.0
Windows 11 WSL2 + Astropy 5
Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.29
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0]
astropy 5.0.2
Numpy 1.22.3
pyerfa 2.0.0.1
Scipy 1.8.0
Top GitHub Comments
Replacing Time with plain objects in the 2 np.searchsorted() calls cuts down the time considerably.
The change made: with #11266, https://github.com/astropy/astropy/blob/110e2dd10d7738212f2d016dae5eb2aa996dea83/astropy/timeseries/downsample.py#L169-L174
pass plain array with .value to np.searchsorted()
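To illustrate the idea of calling np.searchsorted() on plain arrays, here is a NumPy-only sketch of assigning samples to fixed-size bins. It is not the Astropy implementation: the variable names are hypothetical, and the times are assumed to be already available as plain floats (Julian dates), which for a Time object would mean extracting a float array first, at some cost in precision.

```python
import numpy as np

# Sample times as plain floats (Julian dates), 2-minute cadence as in the report.
num_points = 20000
times_jd = 2457000 + np.arange(num_points) * 2 / (24 * 60)

# Start times of 10-minute bins covering the whole series (in days).
bin_size_days = 10 / (24 * 60)
n_bins = int(np.ceil((times_jd[-1] - times_jd[0]) / bin_size_days)) + 1
bin_starts = times_jd[0] + np.arange(n_bins) * bin_size_days

# searchsorted on plain float arrays: index of the bin each sample falls into.
bin_index = np.searchsorted(bin_starts, times_jd, side="right") - 1

# Number of samples per bin.
counts = np.bincount(bin_index, minlength=n_bins)
print(n_bins, counts.sum())
```

Because both arguments are ordinary float arrays, the search runs entirely in NumPy's compiled code, with no Time objects involved in the comparisons.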
Degradation source confirmation: I ran the same test script against the commit before #11266 was merged (549404a80) and the commit after it was merged (110e2dd10). The results show a 10x+ slowdown in elapsed time.
Before #11266:
After #11266: