improve rdtools import time

RdTools is a little slow to import:

In [1]: %time import rdtools
Wall time: 1.04 s

For comparison:

In [1]: %time import pvlib
Wall time: 184 ms

Here’s a breakdown of where import rdtools is spending time:

[Image: breakdown of import time by module, visualized with tuna]

Generated with:

(base) C:\Windows\Temp>python -X importtime -c "import rdtools" 2> rdtools.log
(base) C:\Windows\Temp>tuna rdtools.log
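
For anyone without tuna installed, the same log can be ranked with a few lines of Python. This is only a minimal sketch: it assumes the log lines follow the `import time: <self us> | <cumulative us> | <module>` layout that `-X importtime` writes to stderr, and the helper names `parse_importtime` and `slowest` are made up for illustration.

```python
import re

# Data lines written by `python -X importtime` are assumed to look like:
#   import time:       123 |        456 |   package.module
# The header line ("self [us] | cumulative | imported package") has no
# leading digits, so it does not match and is skipped.
_LINE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\S+)\s*$")


def parse_importtime(text):
    """Return one dict per imported module found in the log text."""
    rows = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue
        self_us, cumulative_us, name = match.groups()
        rows.append(
            {
                "name": name,
                "self_us": int(self_us),
                "cumulative_us": int(cumulative_us),
            }
        )
    return rows


def slowest(rows, n=10):
    """Return the n modules with the largest cumulative import time."""
    return sorted(rows, key=lambda r: r["cumulative_us"], reverse=True)[:n]
```

Reading `rdtools.log` and passing its contents to `parse_importtime`, then `slowest`, gives a rough text-only equivalent of the chart. Note that cumulative time includes nested imports, so a parent package always ranks at least as high as its submodules.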

We could speed up the import time by changing how we import packages. I notice that statsmodels.api, pkg_resources, and h5py together make up a large chunk of the total but aren’t actually used in the “primary” RdTools functions. What do people think about moving those imports into the functions that need them instead of importing at module scope?

Pros:

  • It decreases import time to 600 ms (~40% speedup) on my machine
  • It makes those dependencies optional for people that don’t plan on using those functions – e.g. our fleets pipeline doesn’t actually need the statsmodels package because it doesn’t use the OLS and classical decomp functions.

Cons:

  • It’s nice to have a list of imports at the top of the module; hiding them inside functions is nonstandard and reduces code clarity
  • It violates PEP 8 (https://www.python.org/dev/peps/pep-0008/#imports)
  • It makes the first function invocation slower, and subsequent invocations very marginally slower
  • 1 second to import isn’t that big a deal

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

cwhanse commented, Mar 17, 2020 (2 reactions)

Good point, though that’s an actual change instead of just minor restructuring. Maybe np.polyfit?

My $0.02: unless importing statsmodels becomes a performance headache, I’d stick with it for the long term. There are regression functions in numpy (numpy.linalg.lstsq), but my sense is that the long-term intent is to provide regression and similar functions in statsmodels and scipy, with numpy providing the building blocks. And statsmodels provides options such as robust regression that may become desirable for RdTools applications.
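
For reference, the two numpy options mentioned here give the same ordinary least-squares line. The snippet below is an illustrative sketch on made-up, noise-free data, not a proposal for how RdTools should implement its regression.

```python
import numpy as np

# Synthetic data on an exact line, y = 2x + 1 (made up for illustration).
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0

# Option 1: np.polyfit with deg=1 returns coefficients from the highest
# power down, i.e. (slope, intercept).
slope_pf, intercept_pf = np.polyfit(x, y, deg=1)

# Option 2: np.linalg.lstsq on an explicit design matrix [x, 1].
# It returns (solution, residuals, rank, singular_values).
A = np.column_stack([x, np.ones_like(x)])
(slope_ls, intercept_ls), *_ = np.linalg.lstsq(A, y, rcond=None)
```

Neither call offers the robust-regression options mentioned above, which is part of the argument for keeping statsmodels.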

mdeceglie commented, Mar 5, 2021 (0 reactions)

This is almost a year old and the marginal gains probably don’t justify violating style guides to put imports in functions. I’ll close this for now.
