BUG: mne.parallel crashes when too much data
I'm encountering a weird error when I try to decode too much data, e.g.:
import numpy as np
from mne import create_info, EpochsArray
from mne.decoding import TimeDecoding
# simulate data
n_trial, n_chan, n_time = 800, 300, 1445
info = create_info(n_chan, 128., 'mag')
epochs = EpochsArray(np.random.rand(n_trial, n_chan, n_time),
                     info=info,
                     events=np.zeros((n_trial, 3), int))
y = np.random.randint(0, 2, n_trial)
epochs._data[y==1, :, :] += 1
# fit
td = TimeDecoding(n_jobs=-1, predict_mode='mean-prediction')
td.fit(epochs, y=y)
works fine, but
td.predict(epochs)
returns
---------------------------------------------------------------------------
SystemError Traceback (most recent call last)
<ipython-input-22-274c297025ed> in <module>()
13 td = TimeDecoding(n_jobs=-1)
14 td.fit(epochs, y=y)
---> 15 y_pred = td.predict(epochs)
/home/ubuntu/mne-python/mne/decoding/time_gen.py in predict(self, epochs)
1339 """ # noqa
1340 self._prep_times()
-> 1341 super(TimeDecoding, self).predict(epochs)
1342 self._clean_times()
1343 return self.y_pred_
/home/ubuntu/mne-python/mne/decoding/time_gen.py in predict(self, epochs)
291 n_orig_epochs=n_orig_epochs, test_epochs=test_epochs,
292 **dict(zip(['X', 'train_times'], _chunk_data(X, chunk))))
--> 293 for chunk in chunks)
294
295 # Concatenate chunks across test time dimension.
/home/ubuntu/anaconda/lib/python2.7/site-packages/joblib/parallel.pyc in __call__(self, iterable)
808 # consumption.
809 self._iterating = False
--> 810 self.retrieve()
811 # Make sure that we get a last message telling us we are done
812 elapsed_time = time.time() - self._start_time
/home/ubuntu/anaconda/lib/python2.7/site-packages/joblib/parallel.pyc in retrieve(self)
725 job = self._jobs.pop(0)
726 try:
--> 727 self._output.extend(job.get())
728 except tuple(self.exceptions) as exception:
729 # Stop dispatching any new job in the async callback thread
/home/ubuntu/anaconda/lib/python2.7/multiprocessing/pool.pyc in get(self, timeout)
565 return self._value
566 else:
--> 567 raise self._value
568
569 def _set(self, i, obj):
SystemError: error return without exception set
or
IOError: bad message length
There is no error if I set n_jobs=1, if I reduce the dimensionality of the data (either n_chan, n_time, or n_trial), or if I predict only a subset, e.g.
td.predict(epochs[:596])
works fine, but from 597 epochs onward it crashes:
td.predict(epochs[:597])
I am running on an AWS m4.4xlarge instance with Ubuntu, Anaconda, and MNE dev; joblib is 0.9.4.
I have no idea what to do…
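As a possible interim workaround, here is a minimal sketch that predicts on smaller slices and stitches the results back together. It assumes the per-slice predictions can simply be concatenated along the epochs axis; the chunk size of 500 and the concatenation axis are assumptions and may need adjusting for TimeDecoding's actual output shape.
import numpy as np
# Hypothetical workaround: predict on slices small enough not to crash,
# then recombine the per-slice predictions.
chunk_size = 500  # assumed; anything below the failing size (597 here)
y_pred_chunks = [td.predict(epochs[start:start + chunk_size])
                 for start in range(0, len(epochs), chunk_size)]
# axis=1 assumes predictions come back stacked as (n_times, n_epochs, ...)
y_pred = np.concatenate(y_pred_chunks, axis=1)
Alternatively, simply running with n_jobs=1 avoids the crash entirely, as noted above.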
In retrospect, it would have been better to simply use joblib, I think; just one module to attack when debugging.
On Thu, Apr 28, 2016 at 11:36 PM Jean-Rémi KING notifications@github.com wrote:
I think we probably won't change our default behavior, since we usually use np.array_split; hopefully the pickling overhead is small.
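For context, a minimal sketch of the chunk-and-dispatch pattern being discussed (an illustration only, not MNE's actual implementation; the helper _score_chunk, the data shape, and n_jobs=4 are made up for the example):
import numpy as np
from joblib import Parallel, delayed

def _score_chunk(chunk):
    # Stand-in for the real per-chunk work; returns one value per time point.
    return chunk.mean(axis=(0, 1))

X = np.random.rand(800, 300, 1445)  # (n_trial, n_chan, n_time), as in the report
n_jobs = 4

# np.array_split cuts the time axis into n_jobs roughly equal chunks; each
# chunk is pickled and sent to a worker process, which is where errors like
# the ones reported can surface when the pickled payload gets very large.
chunks = np.array_split(X, n_jobs, axis=-1)
results = Parallel(n_jobs=n_jobs)(delayed(_score_chunk)(c) for c in chunks)

# Reassemble the per-chunk results along the chunked (time) axis.
scores = np.concatenate(results)
The pickling overhead mentioned above is the cost of serializing each chunk (and its results) to and from the worker processes.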