Misleading ImportError when using Parallel inside a "with Parallel(...) as" block (backend='multiprocessing')
See original GitHub issue.

```python
from math import sqrt
from joblib import Parallel, delayed

input_list = [x**2 for x in range(10)]

def main():
    with Parallel(n_jobs=3, backend='multiprocessing') as parallel:
        output = Parallel(n_jobs=2, backend='multiprocessing')(
            delayed(sqrt)(i) for i in input_list)
    return output

if __name__ == '__main__':
    print(main())
```
```
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/tmp/test_joblib_reload.py in <module>()
     12
     13 if __name__ == '__main__':
---> 14     print(main())

/tmp/test_joblib_reload.py in main()
      8     with Parallel(n_jobs=2) as parallel:
---> 9         output = Parallel(n_jobs=2)(delayed(sqrt)(i) for i in input_list)
     10     return output
     11

/home/lesteve/miniconda3/lib/python3.5/site-packages/joblib/parallel.py in __call__(self, iterable)
    764         self._aborting = False
    765         if not self._managed_pool:
--> 766             n_jobs = self._initialize_pool()
    767         else:
    768             n_jobs = self._effective_n_jobs()

/home/lesteve/miniconda3/lib/python3.5/site-packages/joblib/parallel.py in _initialize_pool(self)
    513         already_forked = int(os.environ.get(JOBLIB_SPAWNED_PROCESS, 0))
    514         if already_forked:
--> 515             raise ImportError('[joblib] Attempting to do parallel computing '
    516                 'without protecting your import on a system that does '
    517                 'not support forking. To use parallel-computing in a '

ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
```
This message is misleading because 1) I am on Linux, so my system does support forking, and 2) I am protecting the entry point with an `if __name__ == '__main__'` guard. I am not sure what we should do here, or whether there is an easy way to detect this situation.
For completeness: I originally saw the error in https://github.com/scikit-learn/scikit-learn/issues/6258 and only found time to trace it back recently.
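For the snippet above, one way to sidestep the error (a sketch of a workaround, not a fix for the misleading message itself) is to actually reuse the managed pool that the `with` block provides, i.e. call `parallel(...)` rather than constructing a second `Parallel` object inside the block:

```python
from math import sqrt
from joblib import Parallel, delayed

input_list = [x**2 for x in range(10)]

def main():
    # Call the context manager's own Parallel object; it reuses the
    # already-initialized worker pool instead of trying to spawn a new one,
    # so the JOBLIB_SPAWNED_PROCESS guard is never tripped.
    with Parallel(n_jobs=2, backend='multiprocessing') as parallel:
        output = parallel(delayed(sqrt)(i) for i in input_list)
    return output

if __name__ == '__main__':
    print(main())
```

This is the usage pattern the joblib docs describe for the `Parallel` context manager; whether nesting a fresh `Parallel` inside a managed one should work at all is part of what this issue is about.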
Issue Analytics
- Created: 7 years ago
- Comments: 11 (6 by maintainers)

I see this issue on macOS in a Jupyter notebook working with scikit-learn and multicore processing. This MWE tickles the issue:
The code runs as expected when run as a Python script.
However, under a Jupyter notebook, it throws this error:
I’m not sure if this is an issue with joblib or some downstream issue with Jupyter, but it is a problem.
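The MWE itself was not preserved in this copy of the thread. As a rough stand-in (hypothetical, not the commenter's actual code), a scikit-learn snippet of this shape fans work out through joblib via `n_jobs` and is the kind of code that hits the error under a notebook kernel:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Any estimator or helper with n_jobs > 1 dispatches work through joblib,
# which is where the spawning/forking guard can misfire in Jupyter.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clf = RandomForestClassifier(n_estimators=10, n_jobs=2, random_state=0)
scores = cross_val_score(clf, X, y, cv=3, n_jobs=2)
print(scores.shape)  # (3,)
```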
The very same issue arises in related libraries, such as HDBSCAN. The following alternate code tickles the issue with HDBSCAN:
I can reproduce this error with this line in Jupyter on MacOS.
Changing `n_jobs` to 1 fixes it.
sklearn 0.19.0 with joblib 0.11, jupyter 5.1.0, python 3.5.1, MacOS 10.13.4
Edit:
Restarting the Jupyter notebook fixes it.
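Besides restarting the kernel, another workaround (a sketch, assuming the workload is I/O-bound or releases the GIL, and that switching backends is acceptable) is joblib's thread-based backend, which runs tasks in threads of the current process, spawns no worker processes, and therefore never consults the `JOBLIB_SPAWNED_PROCESS` guard:

```python
from math import sqrt
from joblib import Parallel, delayed

# backend='threading' keeps everything in-process: no fork, no spawn,
# so the misleading ImportError cannot be raised.
output = Parallel(n_jobs=2, backend='threading')(
    delayed(sqrt)(i) for i in range(10))
print(output)
```

Note that threads only give a speedup for work that releases the GIL; for CPU-bound pure-Python code, a process-based backend is still needed.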