How to avoid repeated execution of non-thread-safe code on module re-imports inside Parallel
The following errors started to appear after updating scikit-learn to 0.20 and, as a consequence, joblib to 0.12:
================================== FAILURES ===================================
_________________________ TestExamples.test_examples __________________________
sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 420, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 563, in __call__
return self.func(*args, **kwargs)
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py", line 261, in __call__
for func, args, kwargs in self.items]
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py", line 261, in <listcomp>
for func, args, kwargs in self.items]
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 242, in fit_ovr_binary
return binary_clf.fit(X, y, sample_weight)
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 306, in fit
self._execute_command(cmd)
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 267, in _execute_command
universal_newlines=True).communicate()
File "C:\Miniconda36-x64\envs\test-environment\lib\subprocess.py", line 756, in __init__
restore_signals, start_new_session)
File "C:\Miniconda36-x64\envs\test-environment\lib\subprocess.py", line 1155, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
"""
The above exception was the direct cause of the following exception:
self = <test_examples.TestExamples testMethod=test_examples>
def test_examples(self):
for filename in find_files(os.path.join(os.path.abspath(os.path.dirname(__file__)), os.path.pardir, 'examples')):
> exec(open(filename).read(), globals())
tests\test_examples.py:17:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
<string>:16: in <module>
???
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py:502: in fit
self._fit_multiclass_task(X, y, sample_weight, params)
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\rgf_model.py:377: in _fit_multiclass_task
for i in range(self._n_classes))
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py:996: in __call__
self.retrieve()
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py:899: in retrieve
self._output.extend(job.get(timeout=self.timeout))
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py:517: in wrap_future_result
return future.result(timeout=timeout)
C:\Miniconda36-x64\envs\test-environment\lib\concurrent\futures\_base.py:432: in result
return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Future at 0x5ae8519630 state=finished raised FileNotFoundError>
def __get_result(self):
if self._exception:
> raise self._exception
E FileNotFoundError: [WinError 2] The system cannot find the file specified
C:\Miniconda36-x64\envs\test-environment\lib\concurrent\futures\_base.py:384: FileNotFoundError
______________________ TestRGFClassfier.test_attributes _______________________
sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 420, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 563, in __call__
return self.func(*args, **kwargs)
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py", line 261, in __call__
for func, args, kwargs in self.items]
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py", line 261, in <listcomp>
for func, args, kwargs in self.items]
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 242, in fit_ovr_binary
return binary_clf.fit(X, y, sample_weight)
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 306, in fit
self._execute_command(cmd)
File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 267, in _execute_command
universal_newlines=True).communicate()
File "C:\Miniconda36-x64\envs\test-environment\lib\subprocess.py", line 756, in __init__
restore_signals, start_new_session)
File "C:\Miniconda36-x64\envs\test-environment\lib\subprocess.py", line 1155, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
"""
The above exception was the direct cause of the following exception:
self = <test_rgf_python.TestRGFClassfier testMethod=test_attributes>
def test_attributes(self):
clf = self.classifier_class(**self.kwargs)
attributes = ('estimators_', 'classes_', 'n_classes_', 'n_features_', 'fitted_',
'sl2_', 'min_samples_leaf_', 'n_iter_')
for attr in attributes:
self.assertRaises(NotFittedError, getattr, clf, attr)
> clf.fit(self.X_train, self.y_train)
tests\test_rgf_python.py:256:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py:502: in fit
self._fit_multiclass_task(X, y, sample_weight, params)
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\rgf_model.py:377: in _fit_multiclass_task
for i in range(self._n_classes))
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py:996: in __call__
self.retrieve()
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py:899: in retrieve
self._output.extend(job.get(timeout=self.timeout))
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py:517: in wrap_future_result
return future.result(timeout=timeout)
C:\Miniconda36-x64\envs\test-environment\lib\concurrent\futures\_base.py:432: in result
return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Future at 0x5ae86fe320 state=finished raised FileNotFoundError>
def __get_result(self):
if self._exception:
> raise self._exception
E FileNotFoundError: [WinError 2] The system cannot find the file specified
C:\Miniconda36-x64\envs\test-environment\lib\concurrent\futures\_base.py:384: FileNotFoundError
---------------------------- Captured stderr call -----------------------------
ERROR: The process with PID 2448 (child process of PID 1612) could not be terminated.
Reason: The operation attempted is not supported.
I use Parallel to communicate with an executable file that uses OpenMP:
https://github.com/RGF-team/rgf/blob/a7c0a5b5b51d26eac689650194867a33ad640c47/python-package/rgf/rgf_model.py#L373-L377
https://github.com/RGF-team/rgf/blob/a7c0a5b5b51d26eac689650194867a33ad640c47/python-package/rgf/utils.py#L263-L267
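For readers without the links at hand: the frame that fails in the tracebacks above is a `subprocess.Popen(...).communicate()` call. Below is a minimal sketch of such a wrapper, not the actual rgf code. The function name `execute_command` and the error message are assumptions made for illustration. It shows how a missing executable surfaces as `FileNotFoundError` and how the message can be made to name the command that was not found.

```python
import subprocess
import sys


def execute_command(cmd):
    """Run cmd (a list of arguments) and return its captured output as text.

    Hypothetical helper: the name and the error message are illustrative only.
    """
    try:
        proc = subprocess.Popen(cmd,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                universal_newlines=True)
    except FileNotFoundError as exc:
        # Popen raises FileNotFoundError when the executable cannot be found;
        # re-raise with the offending command name to ease debugging.
        raise FileNotFoundError(
            "executable not found: {!r}".format(cmd[0])) from exc
    output, _ = proc.communicate()
    return output


if __name__ == "__main__":
    # Use the running Python interpreter as a stand-in for the real executable.
    print(execute_command([sys.executable, "-c", "print('ok')"]))
```

When such a wrapper is called from a worker process, the exception is raised in the worker and re-raised in the parent, which matches the two-part tracebacks shown above.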
Failures are random: roughly 1 failure in 3 runs, as observed over 8 re-builds of the same commit.
The error appears only on x64 Windows with Python 3.7. x86 Windows, Linux, and macOS are unaffected, and the error does not appear on Python 2.7 or 3.4-3.6 either.
After running into this error, I tried to run scikit-learn with joblib from PyPI (versions 0.12.0 - 0.12.5) via set SKLEARN_SITE_JOBLIB=true, with no luck.
The old stable joblib 0.11 cannot be used with the newest scikit-learn because of the following error in RandomForestRegressor:
# Parallel loop: we prefer the threading backend as the Cython code
# for fitting the trees is internally releasing the Python GIL
# making threading more efficient than multiprocessing in
# that case. However, we respect any parallel_backend contexts set
# at a higher level, since correctness does not rely on using
# threads.
trees = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
> prefer="threads")(
delayed(_parallel_build_trees)(
t, self, X, y, sample_weight, i, len(trees),
verbose=self.verbose, class_weight=self.class_weight)
for i, t in enumerate(trees))
E TypeError: __init__() got an unexpected keyword argument 'prefer'
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\ensemble\forest.py:331: TypeError
Top GitHub Comments
Thanks a lot to all participants, especially @tomMoral! I’ve run the tests on CI multiple times, and the proposed fix solves the original problem.
I suppose this issue can be closed now (and probably renamed, to help other users find the very useful information about pickling module variables).
Also note that if you set OMP_NUM_THREADS explicitly to some other value, joblib will not override it and will instead propagate that value to the worker processes.
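As a minimal sketch of what setting the variable explicitly looks like from Python, the snippet below starts a child process with OMP_NUM_THREADS set and reports the value the child sees. It uses only the standard library rather than joblib, and the helper name `child_omp_num_threads` is hypothetical. It demonstrates only that an explicitly set value reaches a child process's environment, not joblib's own handling of the variable.

```python
import os
import subprocess
import sys


def child_omp_num_threads(value):
    """Start a child Python process with OMP_NUM_THREADS=value.

    Returns the value of OMP_NUM_THREADS as seen inside the child.
    Hypothetical helper, for illustration only.
    """
    # Copy the parent's environment and set the variable explicitly.
    env = dict(os.environ, OMP_NUM_THREADS=str(value))
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import os; print(os.environ.get('OMP_NUM_THREADS'))"],
        env=env,
        universal_newlines=True)
    return out.strip()


if __name__ == "__main__":
    print(child_omp_num_threads(2))
```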