timeseries aggregate_downsample() appears to be significantly slower in Astropy 5


Description

TimeSeries aggregate_downsample() appears to be significantly slower in Astropy 5 (comparing it with Astropy 4).

For the sample TimeSeries below (20000 data points), calling aggregate_downsample(ts, time_bin_size=10*u.minute) takes roughly:

  • Astropy 4.3.1, Windows 11: < 1 sec
  • Astropy 5, Windows 11: ~ 10 sec
  • Astropy 5, Windows 11 with WSL2: ~ 7 sec

The Astropy versions in question were installed with conda (pip for WSL2).

Steps to Reproduce

The sample time series and the timing script:

# Test timeseries aggregate_downsample's running time

from astropy.time import Time
from astropy.timeseries import TimeSeries, aggregate_downsample
from astropy import units as u
import numpy as np
from timeit import default_timer as timer

# the scale of a typical TESS 2-minute cadence lightcurve
num_points = 20000
time = Time(2457000 + np.arange(0, num_points) / 24 / 60 * 2, format="jd")
flux = np.ones(num_points)

ts = TimeSeries(time=time, data=dict(flux=flux))

# time a single call to aggregate_downsample with 10-minute bins
start = timer()
ts_b = aggregate_downsample(ts, time_bin_size=10*u.minute)
end = timer()

print("ts.bin(10 min) elapsed time:", (end - start))
# number of input samples and number of output bins
print(len(ts), len(ts_b))
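
A single start/end measurement like the one above can be noisy. As a rough sketch (not part of the original report), the standard-library timeit.repeat can take the best of several runs. The helper name best_time and the stand-in workload are hypothetical:

```python
from timeit import repeat


def best_time(func, repeats=5):
    """Return the fastest wall-clock time in seconds over `repeats` single calls.

    `best_time` is a hypothetical helper for illustration only.
    """
    # number=1 runs func once per measurement; repeat= collects several measurements
    return min(repeat(func, number=1, repeat=repeats))


# Stand-in workload so the sketch runs without Astropy installed
elapsed = best_time(lambda: sum(range(100_000)))
print("best of 5 elapsed time:", elapsed)
```

To time the reported case, the stand-in lambda would be swapped for one that wraps the aggregate_downsample call from the script.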

System Details

Windows 11 + Astropy 4.3.1

Windows-10-10.0.22000-SP0
Python  3.9.10 | packaged by conda-forge | (main, Feb  1 2022, 21:21:54) [MSC v.1929 64 bit (AMD64)]
astropy  4.3.1
Numpy  1.22.3
pyerfa  2.0.0.1
Scipy  1.8.0

Windows 11 + Astropy 5

Windows-10-10.0.22000-SP0
Python  3.9.10 | packaged by conda-forge | (main, Feb  1 2022, 21:21:54) [MSC v.1929 64 bit (AMD64)]
astropy  5.0.4
Numpy  1.22.3
pyerfa  2.0.0.1
Scipy  1.8.0

Windows 11 WSL2 + Astropy 5

Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.29
Python  3.8.10 (default, Nov 26 2021, 20:14:08) 
[GCC 9.3.0]
astropy  5.0.2
Numpy  1.22.3
pyerfa  2.0.0.1
Scipy  1.8.0

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 10 (9 by maintainers)

Top GitHub Comments

orionlee commented, Apr 5, 2022

Passing plain arrays instead of Time objects to the 2 np.searchsorted() calls cuts down the time considerably.

  • The time to run the test drops from ~10 sec to ~1.5 sec.
  • That is still noticeably slower than the code before the PR (~0.5 sec).

The change made: in the code introduced by #11266 (https://github.com/astropy/astropy/blob/110e2dd10d7738212f2d016dae5eb2aa996dea83/astropy/timeseries/downsample.py#L169-L174), pass plain arrays obtained with .value to np.searchsorted():

    indices = np.searchsorted(bin_end.value, ts_sorted.time[keep].value)
    # For time == bin_start[i+1] == bin_end[i], let bin_start takes precedence
    if np.all(bin_start[1:] >= bin_end[:-1]):
        indices_start = np.searchsorted(ts_sorted.time[keep].value,
                                        [i.value for i in np.minimum(bin_start, ts_sorted.time[-1])])
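
To illustrate why plain arrays suit np.searchsorted(), here is a simplified, NumPy-only sketch of searchsorted-based binning. It is not Astropy's implementation. It assumes contiguous, sorted bins and uses float minute offsets in place of the .value arrays; the function name bin_means is hypothetical:

```python
import numpy as np


def bin_means(times, values, bin_start, bin_end):
    """Average `values` per bin using np.searchsorted on plain float arrays.

    Hypothetical helper: assumes the bins are sorted and contiguous
    (bin_end[i] == bin_start[i + 1]).
    """
    # Drop samples that fall outside the overall binned range
    keep = (times >= bin_start[0]) & (times < bin_end[-1])
    # side="right" counts the bin ends <= t, so a sample sitting exactly on
    # bin_end[i] == bin_start[i + 1] is assigned to bin i + 1
    indices = np.searchsorted(bin_end, times[keep], side="right")
    n_bins = len(bin_start)
    counts = np.bincount(indices, minlength=n_bins)
    sums = np.bincount(indices, weights=values[keep], minlength=n_bins)
    return counts, sums / counts


# 20 samples at a 2-minute cadence, grouped into 10-minute bins
times_min = np.arange(20) * 2.0
flux = np.arange(20.0)
bin_start = np.arange(0.0, 40.0, 10.0)
bin_end = bin_start + 10.0

counts, means = bin_means(times_min, flux, bin_start, bin_end)
print(counts, means)
```

In this sketch an empty bin would produce a NaN mean with a runtime warning, which a real implementation would need to handle explicitly.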

orionlee commented, Apr 5, 2022

  • I can confirm that PR #11266 does lead to the performance degradation. See below.
  • It’d be great if someone could run the same test on a non-Windows platform to confirm whether the degradation happens only on Windows.

Degradation source confirmation: I ran the same test script against the commit before #11266 was merged (549404a80) and the commit after it was merged (110e2dd10). The results show a slowdown of more than 10x in elapsed time.

Before #11266:

(astropy_RO) (astropy_RO) /c/dev/_astropy_RO ((549404a80...))
$python test_binning_time.py
Windows-10-10.0.22000-SP0
Python 3.9.10 | packaged by conda-forge | (main, Feb  1 2022, 21:21:54) [MSC v.1929 64 bit (AMD64)]
Numpy 1.22.3
pyerfa 2.0.0.1
astropy 5.0.dev937+g549404a80.d20220405
ts.bin(10 min) elapsed time: 0.5385730999999998
20000 4000

After #11266:

(astropy_RO) (astropy_RO) /c/dev/_astropy_RO ((110e2dd10...))
$python test_binning_time.py
Windows-10-10.0.22000-SP0
Python 3.9.10 | packaged by conda-forge | (main, Feb  1 2022, 21:21:54) [MSC v.1929 64 bit (AMD64)]
Numpy 1.22.3
pyerfa 2.0.0.1
astropy 5.0.dev943+g110e2dd10.d20220405
ts.bin(10 min) elapsed time: 10.4164447
20000 4000
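
As a quick check of the numbers in the two runs above, the slowdown factor and the expected bin count can be computed directly. The variable names are illustrative; the timings are copied from the logs:

```python
# Elapsed times copied from the two runs above (seconds)
before_s = 0.5385730999999998   # commit 549404a80, before #11266
after_s = 10.4164447            # commit 110e2dd10, after #11266

slowdown = after_s / before_s
print(f"slowdown: {slowdown:.1f}x")

# 20000 samples at a 2-minute cadence, binned into 10-minute bins
num_points = 20000
cadence_min = 2
bin_size_min = 10
expected_bins = num_points * cadence_min // bin_size_min
print("expected bins:", expected_bins)
```

The ratio works out to roughly 19x, and the expected bin count matches the 4000 reported by both runs.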

