PERF: regression in getattr for IntervalIndex

Master:

In [14]: idx = pd.interval_range(0, 1000, 1000) 

In [15]: %timeit getattr(idx, '_ndarray_values', idx)
1.29 ms ± 30.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [16]: %timeit idx.closed
321 ns ± 2.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

while on 0.25.3:

In [13]: idx = pd.interval_range(0, 1000, 1000) 

In [14]: %timeit getattr(idx, '_ndarray_values', idx)  
90.5 ns ± 2.09 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [15]: %timeit idx.closed 
105 ns ± 1.61 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

(I only checked a few attributes; I didn’t check whether the slowdown is specific to those attributes or affects getattr in general.)
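To check whether the slowdown is specific to certain attributes, the `%timeit` runs above can be generalized into a small script and run under each pandas version. This is only a sketch: the helper names `time_attribute_access` and `benchmark_interval_index` are made up for illustration, and `left`, `right` and `dtype` are extra attributes chosen here, not ones measured in the report. The lambda wrapper adds a small constant overhead to every measurement, so the numbers are comparable to each other but not directly to the `%timeit` figures.

```python
import timeit


def time_attribute_access(obj, names, number=100_000, repeat=5):
    """Return {name: best nanoseconds per getattr(obj, name, obj) call}."""
    results = {}
    for name in names:
        # Same call shape as in the report: a default is passed, so a
        # missing attribute is timed as a failed lookup instead of raising.
        timer = timeit.Timer(lambda: getattr(obj, name, obj))
        # The minimum over several repeats reduces scheduling noise.
        best = min(timer.repeat(repeat=repeat, number=number))
        results[name] = best / number * 1e9
    return results


def benchmark_interval_index():
    # pandas is imported lazily so the helper above works without it.
    import pandas as pd

    idx = pd.interval_range(0, 1000, periods=1000)
    # "_ndarray_values" and "closed" come from the report; the remaining
    # names are illustrative additions (assumption).
    names = ["_ndarray_values", "closed", "left", "right", "dtype"]
    for name, ns in time_attribute_access(idx, names).items():
        print(f"pandas {pd.__version__}  {name:<16} {ns:10.1f} ns per access")
```

Calling `benchmark_interval_index()` once in an environment with 0.25.3 and once with master should show whether every attribute regressed or only some. On pandas versions where `_ndarray_values` no longer exists, that row measures the failed lookup instead.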

I think this is the cause, or one of the causes, of several regressions that can currently be seen at https://pandas.pydata.org/speed/pandas/ (e.g. https://pandas.pydata.org/speed/pandas/#reshape.Cut.time_cut_timedelta?p-bins=1000&commits=6efc2379-b9de33e3)

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Comments: 19 (19 by maintainers)

Top GitHub Comments

jorisvandenbossche commented, Jan 28, 2020 (1 reaction)

BTW, I don’t think this should necessarily be a blocker for 1.0. But we should keep this in mind for all the Index -> ExtensionArray (EA) delegation refactoring that is happening, and look into improving this.
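As a rough illustration of why delegation can cost time on every attribute access, here is a toy comparison in plain Python. It is not pandas code and does not claim to reproduce the actual cause of the regression: `ToyArray`, `DirectIndex`, `DelegatingIndex` and `ns_per_access` are hypothetical names, and the single property hop shown here is far simpler than whatever the real Index classes do.

```python
import timeit


class ToyArray:
    """Stand-in for an extension array that owns the data."""

    def __init__(self, closed):
        self.closed = closed


class DirectIndex:
    """Stores the attribute on the index object itself."""

    def __init__(self, closed):
        self.closed = closed


class DelegatingIndex:
    """Forwards the attribute to the wrapped array on every access."""

    def __init__(self, closed):
        self._data = ToyArray(closed)

    @property
    def closed(self):
        # One extra Python-level call plus a second attribute lookup.
        return self._data.closed


def ns_per_access(obj, number=200_000):
    """Best-of-five nanoseconds per `obj.closed` access."""
    timer = timeit.Timer(lambda: obj.closed)
    return min(timer.repeat(repeat=5, number=number)) / number * 1e9
```

Comparing `ns_per_access(DirectIndex("right"))` with `ns_per_access(DelegatingIndex("right"))` gives a feel for the per-access overhead of one layer of delegation; each additional layer or fallback lookup adds to it.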

simonjayhawkins commented, Oct 29, 2020 (0 reactions)

Moved off the 1.1.4 milestone (scheduled for release tomorrow), as there are no PRs to fix this in the pipeline.
