How to avoid multiple non-thread safe code execution on module re-imports inside Parallel

The following errors started to appear after updating scikit-learn to 0.20 and, as a consequence, joblib to 0.12:

================================== FAILURES ===================================
_________________________ TestExamples.test_examples __________________________
sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 420, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 563, in __call__
    return self.func(*args, **kwargs)
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py", line 261, in __call__
    for func, args, kwargs in self.items]
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py", line 261, in <listcomp>
    for func, args, kwargs in self.items]
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 242, in fit_ovr_binary
    return binary_clf.fit(X, y, sample_weight)
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 306, in fit
    self._execute_command(cmd)
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 267, in _execute_command
    universal_newlines=True).communicate()
  File "C:\Miniconda36-x64\envs\test-environment\lib\subprocess.py", line 756, in __init__
    restore_signals, start_new_session)
  File "C:\Miniconda36-x64\envs\test-environment\lib\subprocess.py", line 1155, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
"""
The above exception was the direct cause of the following exception:
self = <test_examples.TestExamples testMethod=test_examples>
    def test_examples(self):
        for filename in find_files(os.path.join(os.path.abspath(os.path.dirname(__file__)), os.path.pardir, 'examples')):
>           exec(open(filename).read(), globals())
tests\test_examples.py:17: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
<string>:16: in <module>
    ???
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py:502: in fit
    self._fit_multiclass_task(X, y, sample_weight, params)
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\rgf_model.py:377: in _fit_multiclass_task
    for i in range(self._n_classes))
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py:996: in __call__
    self.retrieve()
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py:899: in retrieve
    self._output.extend(job.get(timeout=self.timeout))
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py:517: in wrap_future_result
    return future.result(timeout=timeout)
C:\Miniconda36-x64\envs\test-environment\lib\concurrent\futures\_base.py:432: in result
    return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Future at 0x5ae8519630 state=finished raised FileNotFoundError>
    def __get_result(self):
        if self._exception:
>           raise self._exception
E           FileNotFoundError: [WinError 2] The system cannot find the file specified
C:\Miniconda36-x64\envs\test-environment\lib\concurrent\futures\_base.py:384: FileNotFoundError
______________________ TestRGFClassfier.test_attributes _______________________
sklearn.externals.joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\externals\loky\process_executor.py", line 420, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 563, in __call__
    return self.func(*args, **kwargs)
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py", line 261, in __call__
    for func, args, kwargs in self.items]
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py", line 261, in <listcomp>
    for func, args, kwargs in self.items]
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 242, in fit_ovr_binary
    return binary_clf.fit(X, y, sample_weight)
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 306, in fit
    self._execute_command(cmd)
  File "C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py", line 267, in _execute_command
    universal_newlines=True).communicate()
  File "C:\Miniconda36-x64\envs\test-environment\lib\subprocess.py", line 756, in __init__
    restore_signals, start_new_session)
  File "C:\Miniconda36-x64\envs\test-environment\lib\subprocess.py", line 1155, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
"""
The above exception was the direct cause of the following exception:
self = <test_rgf_python.TestRGFClassfier testMethod=test_attributes>
    def test_attributes(self):
        clf = self.classifier_class(**self.kwargs)
        attributes = ('estimators_', 'classes_', 'n_classes_', 'n_features_', 'fitted_',
                      'sl2_', 'min_samples_leaf_', 'n_iter_')
    
        for attr in attributes:
            self.assertRaises(NotFittedError, getattr, clf, attr)
>       clf.fit(self.X_train, self.y_train)
tests\test_rgf_python.py:256: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\utils.py:502: in fit
    self._fit_multiclass_task(X, y, sample_weight, params)
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\rgf\rgf_model.py:377: in _fit_multiclass_task
    for i in range(self._n_classes))
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py:996: in __call__
    self.retrieve()
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\parallel.py:899: in retrieve
    self._output.extend(job.get(timeout=self.timeout))
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py:517: in wrap_future_result
    return future.result(timeout=timeout)
C:\Miniconda36-x64\envs\test-environment\lib\concurrent\futures\_base.py:432: in result
    return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Future at 0x5ae86fe320 state=finished raised FileNotFoundError>
    def __get_result(self):
        if self._exception:
>           raise self._exception
E           FileNotFoundError: [WinError 2] The system cannot find the file specified
C:\Miniconda36-x64\envs\test-environment\lib\concurrent\futures\_base.py:384: FileNotFoundError
---------------------------- Captured stderr call -----------------------------
ERROR: The process with PID 2448 (child process of PID 1612) could not be terminated.
Reason: The operation attempted is not supported.

I use Parallel to communicate with an executable file that uses OpenMP:

https://github.com/RGF-team/rgf/blob/a7c0a5b5b51d26eac689650194867a33ad640c47/python-package/rgf/rgf_model.py#L373-L377
https://github.com/RGF-team/rgf/blob/a7c0a5b5b51d26eac689650194867a33ad640c47/python-package/rgf/utils.py#L263-L267
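
The linked code is not reproduced here. As a rough sketch of the pattern described (several parallel tasks, each launching an external command and waiting on `communicate()`), the following uses only the standard library rather than joblib. The names `execute_command` and `run_all` are hypothetical, and the Python interpreter stands in for the real OpenMP executable.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def execute_command(cmd):
    """Run an external command; return its exit code and captured output."""
    proc = subprocess.Popen(cmd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    output, _ = proc.communicate()
    return proc.returncode, output


def run_all(n_tasks, max_workers=2):
    """Launch one external command per task and collect results in order."""
    # Assumption: the current Python interpreter stands in for the real
    # executable; each child process just prints its task index.
    cmds = [[sys.executable, "-c", "print({})".format(i)]
            for i in range(n_tasks)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(execute_command, cmds))


if __name__ == "__main__":
    for code, out in run_all(3):
        print(code, out.strip())
```

If the first element of `cmd` does not resolve to an existing executable, `subprocess.Popen` raises `FileNotFoundError`, which is the exception at the bottom of the tracebacks above.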

Failures are random: roughly 1 failure in every 3 runs. The following picture illustrates this (8 re-builds of the same commit):

[image: results of 8 re-builds of the same commit]

The error occurs only on x64 Windows with Python 3.7. x86 Windows, Linux, and macOS are unaffected. The error does not appear with Python 2.7 or 3.4-3.6 either.

After hitting this error, I tried running scikit-learn with joblib from PyPI (versions 0.12.0 - 0.12.5) by setting SKLEARN_SITE_JOBLIB=true, with no luck.

The old stable joblib 0.11 cannot be used with the newest scikit-learn because of the following error in RandomForestRegressor:

            # Parallel loop: we prefer the threading backend as the Cython code
            # for fitting the trees is internally releasing the Python GIL
            # making threading more efficient than multiprocessing in
            # that case. However, we respect any parallel_backend contexts set
            # at a higher level, since correctness does not rely on using
            # threads.
            trees = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
>                            prefer="threads")(
                delayed(_parallel_build_trees)(
                    t, self, X, y, sample_weight, i, len(trees),
                    verbose=self.verbose, class_weight=self.class_weight)
                for i, t in enumerate(trees))
E           TypeError: __init__() got an unexpected keyword argument 'prefer'
C:\Miniconda36-x64\envs\test-environment\lib\site-packages\sklearn\ensemble\forest.py:331: TypeError

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 20 (11 by maintainers)

Top GitHub Comments

1 reaction
StrikerRUS commented, Nov 9, 2018

Thanks a lot to all participants, especially @tomMoral! I’ve run the tests on CI multiple times, and the proposed fix solves the original problem.

I suppose this issue can be closed now (and probably renamed to help other users find the very useful information about pickling module variables).

1 reaction
ogrisel commented, Nov 7, 2018

Also note that if you set OMP_NUM_THREADS explicitly to some other value, joblib will not override it and will instead propagate it to the worker processes.
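
As a minimal sketch of the mechanism this relies on, the snippet below sets OMP_NUM_THREADS explicitly and shows that a child process sees the same value. It uses only the standard library and does not exercise joblib itself; the helper name `child_omp_threads` is hypothetical.

```python
import os
import subprocess
import sys


def child_omp_threads(value):
    """Start a child process with OMP_NUM_THREADS set; return what it sees."""
    # Copy the current environment and set the variable explicitly.
    env = dict(os.environ, OMP_NUM_THREADS=str(value))
    # The child only reports the value found in its own environment.
    code = "import os; print(os.environ.get('OMP_NUM_THREADS'))"
    out = subprocess.check_output([sys.executable, "-c", code],
                                  env=env,
                                  universal_newlines=True)
    return out.strip()


if __name__ == "__main__":
    print(child_omp_threads(2))
```

An OpenMP executable launched from a worker process reads the variable from its inherited environment in the same way, so an explicit setting made before the workers start is what the executable ends up seeing.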
