Speed up get_formatted_array
Problem description
Using `get_formatted_array` splits the `loc::techs` and `loc::tech::carriers` string sets and passes data back and forth between xarray and pandas to produce a sparse matrix for easier indexing (e.g. summing over a single `tech`).
This can take a very long time for large DataArrays, and has been reported to hit memory limits on some devices.
So, it should be made more efficient. This could be a matter of defining `loc::techs` etc. as tuples instead of `::`-concatenated strings. They would then automagically be parsed as a MultiIndex, with no need to apply string operations.
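To illustrate the idea, here is a minimal pandas-only sketch, not Calliope's actual implementation. The data, the level names (`locs`, `techs`) and the helper functions `split_by_strings` and `from_tuples` are all hypothetical. It contrasts parsing `::`-joined labels with building the MultiIndex directly from tuples.

```python
import pandas as pd

# Hypothetical data: a Series indexed by "::"-concatenated loc::tech strings.
flat = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.Index(
        ["region1::pv", "region1::wind", "region2::pv"], name="loc_techs"
    ),
)


def split_by_strings(series, names=("locs", "techs")):
    """String-based approach: parse every label, then build a MultiIndex."""
    # On an Index, str.split(..., expand=True) returns a MultiIndex.
    parts = series.index.str.split("::", expand=True)
    parts.names = list(names)
    return pd.Series(series.to_numpy(), index=parts)


def from_tuples(keys, values, names=("locs", "techs")):
    """Tuple-based approach: the keys are already structured, no parsing."""
    index = pd.MultiIndex.from_tuples(keys, names=list(names))
    return pd.Series(values, index=index)


by_strings = split_by_strings(flat)
by_tuples = from_tuples(
    [("region1", "pv"), ("region1", "wind"), ("region2", "pv")],
    [1.0, 2.0, 3.0],
)

# With a MultiIndex, summing over a single tech is a level-based selection.
pv_total = by_tuples.xs("pv", level="techs").sum()
per_tech = by_tuples.groupby(level="techs").sum()
```

Both routes end with the same index. The tuple route skips the per-label string split, which is the step the issue suspects of being slow on large arrays. Whether this also helps with memory use when unstacking would need to be profiled separately.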
Calliope version
0.6.3
Issue Analytics
- State:
- Created 5 years ago
- Comments:7 (7 by maintainers)
Top GitHub Comments
@timtroendle, I actually just switched off postprocessing on the cluster in my runs, due to the same issue… Anyway, I had some stuff waiting to go on this, see PR #231 for a working branch that you could test with. It may still blow up on unstacking the MultiIndex (but my memory profiling suggests a much lower memory use than the previous incarnation of `get_formatted_array`).

I’m a bit confused by your solution: how does it go from `("loctechscarriers", data_var_df.index)` to being possible to select a location using `("loctechscarriers", data_var_df.index)`? If it offers an even better solution, I’m happy to look at updating the PR in line with it.

Some ideas in here for further improvements of how we deal with arrays…