pvlib.iam.marion_integrate uses too much memory for vector inputs

pvlib.iam.marion_integrate (which is mostly relevant as a helper for pvlib.iam.marion_diffuse) needs quite a bit of memory when passed vector inputs. An input of length 1000 allocates around 2 GB of memory on my machine, so naively passing in a standard 8760 would use roughly 17-18 GB. Unfortunately I was very much focused on fixed-tilt simulations when I wrote pvlib’s implementation and never tried it out on large vector inputs, so this problem went unnoticed until @spaneja pointed it out to me.

I think any vectorized implementation of this algorithm is going to be rather memory-heavy, so I’m skeptical that achieving even a factor of 10 reduction in memory usage is possible here without completely changing the approach (and likely shifting the burden from memory to CPU). However, here are two low-hanging fruits worth considering:

The current implementation has a handful of large 2-D arrays local to the function that only get released when the function returns. Some of them are only used near the beginning of the function but still take up memory for the entire function duration. Using the del statement to instruct python that those arrays are no longer needed allows python to reclaim that memory immediately and recycle it for subsequent allocations. This is probably a simplification of what actually happens, but it seems consistent with the below observations.
np.float32 cuts memory usage in half compared with np.float64 and (probably) doesn’t meaningfully change the result. It’s not like surface_tilt has more than a few sig figs anyway.
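
To make the two ideas concrete, here is a minimal sketch. It is not pvlib's code: `weighted_sum_naive` and `weighted_sum_del` are hypothetical toy functions that only mimic the pattern of building several large 2-D temporaries from a 1-D input. The second version releases each temporary with del as soon as it is no longer needed and accepts a dtype so the same calculation can run in np.float32.

```python
import numpy as np


def weighted_sum_naive(x, n_grid=200):
    """Toy calculation: every temporary stays alive until return."""
    grid = np.linspace(0.0, 1.0, n_grid)
    a = np.outer(x, grid)   # large 2-D temporary
    b = np.cos(a)           # second large temporary, a still alive
    c = b * b               # third large temporary, a and b still alive
    return c.sum(axis=1)


def weighted_sum_del(x, n_grid=200, dtype=np.float64):
    """Same calculation, dropping temporaries early, optional float32."""
    x = np.asarray(x, dtype=dtype)
    grid = np.linspace(0.0, 1.0, n_grid, dtype=dtype)
    a = np.outer(x, grid)
    b = np.cos(a)
    del a                   # a is not used again; its buffer can be reused
    c = b * b
    del b                   # likewise for b
    return c.sum(axis=1)


x = np.linspace(0.0, 90.0, 1000)
ref = weighted_sum_naive(x)
out64 = weighted_sum_del(x)
out32 = weighted_sum_del(x, dtype=np.float32)

# float32 arrays occupy half the bytes of float64 arrays of the same shape
bytes64 = np.empty((1000, 200), dtype=np.float64).nbytes
bytes32 = np.empty((1000, 200), dtype=np.float32).nbytes
```

Whether float32 is accurate enough for the real function would still need to be checked against the float64 output, as discussed further down.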

Here is a rough memory and timing comparison (using memory_profiler, very handy). pvlib is the current implementation; the two del variants use a strategic sprinkling of del but are otherwise not much different from pvlib. This is for an input of length 1000. The traces here are memory usage sampled at short intervals across a single function invocation; for example the blue pvlib trace shows that the function call took 1.4 seconds to complete and had a peak memory usage slightly higher than 2 GB.
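
For anyone who wants to reproduce this kind of comparison without installing anything, here is a rough sketch that swaps in the standard library's tracemalloc in place of memory_profiler. It reports the peak of Python-tracked allocations (NumPy array buffers are included) and not the process-level memory that memory_profiler samples, so absolute numbers will differ. `peak_bytes`, `three_temporaries` and `with_del` are hypothetical names for illustration only.

```python
import tracemalloc

import numpy as np


def peak_bytes(func, *args, **kwargs):
    """Run func once and return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def three_temporaries(n):
    a = np.ones((n, 500))
    b = a + 1.0
    c = b * 2.0     # a, b and c are all alive here
    return c.sum()


def with_del(n):
    a = np.ones((n, 500))
    b = a + 1.0
    del a           # only b is alive when c is allocated
    c = b * 2.0
    del b
    return c.sum()


res_naive, peak_naive = peak_bytes(three_temporaries, 1000)
res_del, peak_del = peak_bytes(with_del, 1000)
print(f"naive peak: {peak_naive / 1e6:.1f} MB, del peak: {peak_del / 1e6:.1f} MB")
```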

So using a few dels cuts peak memory usage roughly in half. Dropping down to np.float32 cuts it roughly in half again (and gives a nontrivial speedup too). It’s possible that further improvements can be had with other tricks (e.g. using the out parameter that some numpy functions provide) but I’ve not yet explored them.
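
As an illustration of the out idea (untested against the real function, so treat it as a sketch and not a measured improvement), NumPy ufuncs can write their result into an existing array. `cos_sq_sum_inplace` is a hypothetical toy function that reuses a single 2-D buffer for every step.

```python
import numpy as np


def cos_sq_sum_inplace(x, grid):
    """Toy calculation that allocates one large 2-D buffer and reuses it."""
    buf = np.multiply.outer(x, grid)   # the only large allocation
    np.cos(buf, out=buf)               # overwrite buf with cos(buf)
    np.multiply(buf, buf, out=buf)     # overwrite buf with buf * buf
    return buf.sum(axis=1)


x = np.linspace(0.0, 90.0, 1000)
grid = np.linspace(0.0, 1.0, 200)
result = cos_sq_sum_inplace(x, grid)

# Reference version with a fresh temporary at every step
expected = (np.cos(np.outer(x, grid)) ** 2).sum(axis=1)
```

The trade-off is readability: each line mutates buf, so the intermediate quantities no longer have their own names.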

My main question: are we open to using these two strategies in pvlib? Despite being built into python itself, del still seems unpythonic to me for some reason. Switching away from float64 is objectionable to the extent that it’s the standard in scientific computing and is therefore baked into the models by assumption. I think I’m cautiously open to both of the above approaches, iff they are accompanied by good explanatory comments and switching to float32 can be reasonably shown to not introduce a meaningful difference in output.

Remark: even ignoring this memory bloat, I tend to think that applying marion_integrate directly to an 8760 is a bit strange. In simulations with time series surface_tilts, a better approach IMHO is to calculate the IAM values only for np.arange(0, 90, 1) or similar and use pvlib.iam.interp to generate the 8760 IAM series. If nothing else, we might suggest that in the docs.
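
A minimal sketch of that lookup-table idea, under two loud assumptions: `expensive_factor` is a made-up stand-in for the costly per-tilt calculation (it is not Marion's model), and plain np.interp is swapped in for pvlib.iam.interp so the example runs with NumPy alone.

```python
import numpy as np


def expensive_factor(surface_tilt):
    """Hypothetical stand-in for a costly per-tilt calculation."""
    tilt = np.radians(np.asarray(surface_tilt, dtype=float))
    return 0.95 - 0.05 * np.sin(tilt) ** 2


# Evaluate the expensive function on a coarse grid of 91 tilts (0..90 degrees)
tilt_grid = np.arange(0, 91, 1)
factor_grid = expensive_factor(tilt_grid)

# Synthetic hourly tilt series standing in for a tracker's 8760 values
rng = np.random.default_rng(0)
tilt_series = rng.uniform(0, 90, size=8760)

# Interpolate the coarse results onto the full series
factor_series = np.interp(tilt_series, tilt_grid, factor_grid)
```

The large intermediate arrays then scale with the size of the tilt grid and not with the length of the time series; the grid spacing controls the interpolation error.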

Issue Analytics

Created 2 years ago
Comments: 10 (10 by maintainers)

Top GitHub Comments

adriesse commented, Feb 16, 2022

I don’t oppose anything above. Other options:

set a smaller value for num
implement the polynomial approximations given in Marion’s paper
just use 0.95 for sky and 0.95 * iam(90-tilt) for horizon and ground since the impact on Gpoa is pretty small anyway

wholmgren commented, Feb 14, 2022

Not opposed to del but would the numpy out kwarg help?

Top Results From Across the Web

pvlib.iam.marion_integrate

Integrate an incidence angle modifier (IAM) function over solid angle to determine a diffuse irradiance correction factor using Marion's method. This lower- ...