Analysing the performance of different methods to get windows

I’ve started looking at the performance of various ways of getting windows:

1- MNE: epochs.get_data(ind)[0] with lazy loading (preload=False)
2- MNE: epochs.get_data(ind)[0] with eager loading (preload=True)
3- MNE: direct access to the internal numpy array with epochs._data[index] (requires eager loading)
4- HDF5: using h5py (lazy loading)

The script that I used to run the comparison is here: https://github.com/hubertjb/braindecode/blob/profiling-mne-epochs/test/others/profiling_mne_epochs.py

Also, I ran the comparison on a single CPU using:

    taskset -c 0 python profiling_mne_epochs.py
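For readers who want to reproduce this kind of measurement without the linked script, here is a minimal sketch of a per-call timing harness. It is not the script above: `pin_to_cpu` and `time_per_call_ms` are hypothetical helper names, the random array is only a stand-in for preloaded windows, and the CPU pinning relies on `os.sched_setaffinity`, which is available on Linux only.

```python
import os
import timeit

import numpy as np


def pin_to_cpu(cpu=0):
    """Restrict this process to one CPU, similar to `taskset -c 0` (Linux only)."""
    if hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, {cpu})
            return True
        except OSError:
            # The requested CPU may not be available to this process.
            return False
    return False


def time_per_call_ms(func, number=100, repeat=5):
    """Return the best average time per call, in milliseconds."""
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number * 1e3


pinned = pin_to_cpu(0)

# Hypothetical stand-in for preloaded windows: (n_windows, n_channels, n_times).
windows = np.random.rand(200, 22, 1000).astype(np.float32)
elapsed_ms = time_per_call_ms(lambda: windows[10].copy())
print(f"pinned={pinned}, {elapsed_ms:.4f} ms per call")
```

Taking the minimum over several repeats reduces the influence of other processes on the measurement, which is also the reason for pinning to a single CPU.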

Here’s the resulting figure, where the x-axis is the number of time samples in the continuous recording:

[Figure: timing_results]

For the moment, it looks like:

1- ._data[index] is unsurprisingly the fastest, but it requires loading the entire dataset into memory.
2- hdf5 is very close, at around 0.5 ms per loop, which is great given that it only loads one window at a time.
3- get_data(index) is much slower, but this is expected since we know it creates a new mne.Epochs object every time it’s called. Also, the gap between preload=True and preload=False is about 1.5 ms, which might be OK. The main issue, though, seems to be the linear increase in execution time as the continuous data gets bigger.
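The trade-off between eager and lazy access can be illustrated with NumPy alone. The sketch below is a stand-in, not the MNE or h5py code that was benchmarked: it uses `np.memmap` in place of HDF5, and the sizes, the `get_window_lazy` helper and the file name are all assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

n_channels, n_times, window_len = 22, 50_000, 1000
rng = np.random.default_rng(0)
recording = rng.standard_normal((n_channels, n_times)).astype(np.float32)

# Eager: cut every window up front and keep them all in memory.
starts = np.arange(0, n_times - window_len + 1, window_len)
eager_windows = np.stack([recording[:, s:s + window_len] for s in starts])

# Lazy: keep the recording on disk and read a single window on demand.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "recording.dat")
recording.tofile(path)
on_disk = np.memmap(path, dtype=np.float32, mode="r",
                    shape=(n_channels, n_times))


def get_window_lazy(index):
    """Copy one window from the memory-mapped file into a regular array."""
    start = starts[index]
    return np.array(on_disk[:, start:start + window_len])


same = np.array_equal(eager_windows[3], get_window_lazy(3))
print(same)
```

Both paths return the same values. The eager version pays the memory cost once for all windows, while the lazy version only materialises the window that is requested.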

Next steps

Considering the benefits of using MNE for handling the EEG data inside the Dataset classes, I think it would be important to dive deeper into the inner workings of get_data() to see whether simple changes could make this more efficient. I can do some actual profiling on that. What do you think @agramfort @robintibor @gemeinl ?
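As a rough idea of what such profiling could look like, here is a minimal sketch using the standard library's `cProfile` and `pstats`. The function `slow_get_window` is a hypothetical accessor with a deliberate per-call copy; it only imitates overhead that grows with data size and is not MNE's `get_data()`.

```python
import cProfile
import io
import pstats

import numpy as np


def slow_get_window(data, index):
    """Hypothetical accessor that copies the whole array before indexing."""
    copied = data.copy()  # deliberate overhead that grows with the data size
    return copied[index]


data = np.zeros((100, 22, 1000), dtype=np.float32)

profiler = cProfile.Profile()
profiler.enable()
for i in range(50):
    slow_get_window(data, i)
profiler.disable()

# Collect the ten most expensive entries, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time shows which call chain dominates; in a real run the accessor under test would replace the hypothetical function.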

Note: I haven’t included the extraction of labels in this test.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 25 (7 by maintainers)

Top GitHub Comments

1 reaction
robintibor commented, Feb 10, 2020

Great @hubertjb. It seems we are getting to a reasonable training time range. It would also be interesting to see how big the difference is for Deep4. And as you said, maybe num_workers would already close the gap enough to consider it finished. To me, a gap of 1.5x for Deep4 is acceptable.

1 reaction
robintibor commented, Jan 30, 2020

Cool, thanks for the clear info! Yes, diving a bit deeper may be helpful. Keep in mind: we will need fast access mainly during the training loop, that is, directly before returning the tensor/ndarray (in the usual case) that will be passed to the deep network. So for preload=True, accessing _data may be fine by me. The question is more about the preload=False case, and whether it can be fast enough in MNE as well. So the relatively small gap for get_data there is certainly encouraging.

You could additionally do the following on a reasonable GPU to get a better idea of the times we may need to reach in the end: forward one dummy batch of size (64, 22, 1000) through the deep and shallow networks, compute the classification loss with dummy targets, do the backward pass, and measure the wall-clock time (don’t use profilers here for now, as they may not work well with GPUs). Then we have a rough time we want to reach…
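The measurement pattern described here (warm up, then time forward, loss and backward with a wall clock) can be sketched on CPU with NumPy. This is only an illustration of the timing pattern: a linear classifier stands in for the deep and shallow networks, `n_classes = 4` is an assumption, and `train_step` and `time_step_ms` are hypothetical names. On a GPU with PyTorch, kernels run asynchronously, so a synchronization call such as `torch.cuda.synchronize()` would be needed before reading the clock.

```python
import time

import numpy as np

rng = np.random.default_rng(0)
batch = rng.standard_normal((64, 22, 1000)).astype(np.float32)  # dummy batch
n_classes = 4  # assumption, not stated in the thread
targets = rng.integers(0, n_classes, size=64)  # dummy targets
weights = np.zeros((22 * 1000, n_classes), dtype=np.float32)


def train_step(x, y, w):
    """Forward pass, cross-entropy loss and weight gradient (linear model)."""
    flat = x.reshape(len(x), -1)
    logits = flat @ w
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(len(y)), y]).mean()
    grad_logits = probs.copy()
    grad_logits[np.arange(len(y)), y] -= 1.0
    grad_w = flat.T @ grad_logits / len(y)
    return loss, grad_w


def time_step_ms(n_warmup=2, n_runs=10):
    """Average wall-clock time of one training step, in milliseconds."""
    for _ in range(n_warmup):  # warm-up runs are excluded from the timing
        train_step(batch, targets, weights)
    start = time.perf_counter()
    for _ in range(n_runs):
        train_step(batch, targets, weights)
    return (time.perf_counter() - start) / n_runs * 1e3


loss, grad_w = train_step(batch, targets, weights)
step_ms = time_step_ms()
print(f"loss={loss:.4f}, {step_ms:.3f} ms per step")
```

The resulting per-step time gives a budget against which the per-window loading times can be compared.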
