BUG: mne.parallel crashes when too much data
I'm encountering a weird error when I try to decode too much data, e.g.:
import numpy as np
from mne import create_info, EpochsArray
from mne.decoding import TimeDecoding
# simulate data
n_trial, n_chan, n_time = 800, 300, 1445
info = create_info(n_chan, 128., 'mag')
epochs = EpochsArray(np.random.rand(n_trial, n_chan, n_time),
                     info=info,
                     events=np.zeros((n_trial, 3), int))
y = np.random.randint(0, 2, n_trial)
epochs._data[y==1, :, :] += 1
# fit
td = TimeDecoding(n_jobs=-1, predict_mode='mean-prediction')
td.fit(epochs, y=y)
works fine, but
td.predict(epochs)
returns
---------------------------------------------------------------------------
SystemError Traceback (most recent call last)
<ipython-input-22-274c297025ed> in <module>()
13 td = TimeDecoding(n_jobs=-1)
14 td.fit(epochs, y=y)
---> 15 y_pred = td.predict(epochs)
/home/ubuntu/mne-python/mne/decoding/time_gen.py in predict(self, epochs)
1339 """ # noqa
1340 self._prep_times()
-> 1341 super(TimeDecoding, self).predict(epochs)
1342 self._clean_times()
1343 return self.y_pred_
/home/ubuntu/mne-python/mne/decoding/time_gen.py in predict(self, epochs)
291 n_orig_epochs=n_orig_epochs, test_epochs=test_epochs,
292 **dict(zip(['X', 'train_times'], _chunk_data(X, chunk))))
--> 293 for chunk in chunks)
294
295 # Concatenate chunks across test time dimension.
/home/ubuntu/anaconda/lib/python2.7/site-packages/joblib/parallel.pyc in __call__(self, iterable)
808 # consumption.
809 self._iterating = False
--> 810 self.retrieve()
811 # Make sure that we get a last message telling us we are done
812 elapsed_time = time.time() - self._start_time
/home/ubuntu/anaconda/lib/python2.7/site-packages/joblib/parallel.pyc in retrieve(self)
725 job = self._jobs.pop(0)
726 try:
--> 727 self._output.extend(job.get())
728 except tuple(self.exceptions) as exception:
729 # Stop dispatching any new job in the async callback thread
/home/ubuntu/anaconda/lib/python2.7/multiprocessing/pool.pyc in get(self, timeout)
565 return self._value
566 else:
--> 567 raise self._value
568
569 def _set(self, i, obj):
SystemError: error return without exception set
or
IOError: bad message length
There is no error if I set n_jobs=1, if I reduce the dimensionality of the data (either n_chan, n_time, or n_trial), or if I predict only a subset, e.g.
td.predict(epochs[:596])
works fine, but from 597 epochs onward it crashes:
td.predict(epochs[:597])
I am running on an AWS m4.4xlarge instance with Ubuntu, Anaconda, and MNE dev; joblib is 0.9.4.
I have no idea what to do…
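As a possible interim workaround, here is a minimal sketch that predicts on smaller slices and stitches the results back together. It assumes the per-slice predictions can simply be concatenated along the epochs axis; the chunk size of 500 and the concatenation axis are assumptions and may need adjusting for TimeDecoding's actual output shape.
import numpy as np
# Hypothetical workaround: predict on slices small enough not to crash,
# then recombine the per-slice predictions.
chunk_size = 500  # assumed; anything below the failing size (597 here)
y_pred_chunks = [td.predict(epochs[start:start + chunk_size])
                 for start in range(0, len(epochs), chunk_size)]
# axis=1 assumes predictions come back stacked as (n_times, n_epochs, ...)
y_pred = np.concatenate(y_pred_chunks, axis=1)
Alternatively, simply running with n_jobs=1 avoids the crash entirely, as noted above.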
In retrospect, it would have been better to simply use joblib, I think; just one module to attack when debugging.
On Thu, Apr 28, 2016 at 11:36 PM Jean-Rémi KING notifications@github.com wrote:
I think we probably won't change our default behavior, since we usually use np.array_split; hopefully the pickling overhead is small.
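For context, a minimal sketch of the chunk-and-dispatch pattern being discussed (an illustration only, not MNE's actual implementation; the helper _score_chunk, the data shape, and n_jobs=4 are made up for the example):
import numpy as np
from joblib import Parallel, delayed

def _score_chunk(chunk):
    # Stand-in for the real per-chunk work; returns one value per time point.
    return chunk.mean(axis=(0, 1))

X = np.random.rand(800, 300, 1445)  # (n_trial, n_chan, n_time), as in the report
n_jobs = 4

# np.array_split cuts the time axis into n_jobs roughly equal chunks; each
# chunk is pickled and sent to a worker process, which is where errors like
# the ones reported can surface when the pickled payload gets very large.
chunks = np.array_split(X, n_jobs, axis=-1)
results = Parallel(n_jobs=n_jobs)(delayed(_score_chunk)(c) for c in chunks)

# Reassemble the per-chunk results along the chunked (time) axis.
scores = np.concatenate(results)
The pickling overhead mentioned above is the cost of serializing each chunk (and its results) to and from the worker processes.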