joblib not found when Parallel is executed inside a pex

I am trying to package joblib inside a pex. Executing the following script

#!/bin/bash
python3.6 -m venv .
. bin/activate
# gets pip 19.0.3
pip install --upgrade pip
pip install pex
pex joblib -o ./pex
cat <<EOF > test.py
from joblib import Parallel, delayed

def func(k):
    return k

jobs = (delayed(func)(k) for k in range(2))
print(Parallel(n_jobs=2)(jobs))

EOF
deactivate
./pex test.py

raises the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'joblib'
/usr/local/opt/python/bin/python3.6: Error while finding module specification for 'joblib.externals.loky.backend.popen_loky_posix' (ModuleNotFoundError: No module named 'joblib')
/usr/local/opt/python/bin/python3.6: Error while finding module specification for 'joblib.externals.loky.backend.popen_loky_posix' (ModuleNotFoundError: No module named 'joblib')
/Users/j.alberdi/.pex/install/joblib-0.13.2-py2.py3-none-any.whl.71f5159401747e872566d2974ae0d2afcc42332d/joblib-0.13.2-py2.py3-none-any.whl/joblib/externals/loky/backend/semaphore_tracker.py:74: UserWarning: semaphore_tracker: process died unexpectedly, relaunching.  Some semaphores might leak.
  warnings.warn('semaphore_tracker: process died unexpectedly, '
Traceback (most recent call last):
  File ".bootstrap/pex/pex.py", line 352, in execute
  File ".bootstrap/pex/pex.py", line 284, in _wrap_coverage
  File ".bootstrap/pex/pex.py", line 315, in _wrap_profiling
  File ".bootstrap/pex/pex.py", line 400, in _execute
  File ".bootstrap/pex/pex.py", line 454, in execute_interpreter
  File ".bootstrap/pex/pex.py", line 490, in execute_content
  File ".bootstrap/pex/compatibility.py", line 81, in exec_function
  File "test.py", line 7, in <module>
    print(Parallel(n_jobs=2)(jobs))
  File "/Users/j.alberdi/.pex/install/joblib-0.13.2-py2.py3-none-any.whl.71f5159401747e872566d2974ae0d2afcc42332d/joblib-0.13.2-py2.py3-none-any.whl/joblib/parallel.py", line 934, in __call__
    self.retrieve()
  File "/Users/j.alberdi/.pex/install/joblib-0.13.2-py2.py3-none-any.whl.71f5159401747e872566d2974ae0d2afcc42332d/joblib-0.13.2-py2.py3-none-any.whl/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/Users/j.alberdi/.pex/install/joblib-0.13.2-py2.py3-none-any.whl.71f5159401747e872566d2974ae0d2afcc42332d/joblib-0.13.2-py2.py3-none-any.whl/joblib/_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)
  File "/usr/local/Cellar/python/3.6.5_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/local/Cellar/python/3.6.5_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker. The exit codes of the workers are {EXIT(1)}
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'joblib'

Two things stand out:

  • the import line, from joblib import Parallel, delayed, does not raise anything
  • the exception is raised only when the line Parallel(n_jobs=2)(jobs) is executed
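The traceback shows the failing imports coming from freshly started interpreters (python3.6 run with -m joblib.externals.loky.backend.popen_loky_posix, and a "<string>" one-liner), not from the process running test.py. One possible reading, not confirmed in this thread, is that the pex makes joblib importable only by extending sys.path in its own process, and that child interpreters do not inherit that change. The sketch below reproduces that mechanism in isolation with the standard library only. The module name fake_bundled_dep and the helper child_can_import are hypothetical stand-ins, not part of pex or joblib.

```python
import os
import subprocess
import sys
import tempfile


def child_can_import(module_name, extra_env=None):
    """Return True if a fresh interpreter can import module_name."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a clean module search path
    if extra_env:
        env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


# Hypothetical stand-in for a dependency that is reachable only through a
# sys.path entry added at runtime (assumed here to be what happens to joblib
# inside the pex).
bundle_dir = tempfile.mkdtemp()
with open(os.path.join(bundle_dir, "fake_bundled_dep.py"), "w") as fh:
    fh.write("VALUE = 42\n")

sys.path.insert(0, bundle_dir)
import fake_bundled_dep  # succeeds in the parent process

parent_ok = fake_bundled_dep.VALUE == 42
# The child starts with a default sys.path, so the import fails there.
child_ok = child_can_import("fake_bundled_dep")
# Passing the directory through PYTHONPATH makes it visible to the child.
child_ok_with_path = child_can_import(
    "fake_bundled_dep", {"PYTHONPATH": bundle_dir}
)

print("parent import:", parent_ok)
print("child import:", child_ok)
print("child import with PYTHONPATH:", child_ok_with_path)
```

If this reading is right, it would also explain why the import in test.py succeeds while the worker processes report that joblib is missing.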

To check that the pex build procedure itself works, the following test with requests runs fine:

#!/bin/bash
python3.6 -m venv .
. bin/activate
pip install --upgrade pip

pip install pex
pex requests -o ./pex
cat <<EOF > test.py
import requests

print(requests.get("http://www.google.fr"))

EOF
deactivate
./pex test.py
<Response [200]>

Similarly, the following script, which installs joblib in the venv and leaves the venv active,

#!/bin/bash
# python3.7 works too
python3.6 -m venv .
. bin/activate
pip install --upgrade pip
pip install pex joblib
pex joblib -o ./pex
cat <<EOF > test.py
from joblib import Parallel, delayed

def func(k):
    return k

jobs = (delayed(func)(k) for k in range(2))
print(Parallel(n_jobs=2)(jobs))

EOF
#deactivate
./pex test.py

outputs the expected result:

[0, 1]

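However, this script only works because of two things that would have to go for the pex to be deployable somewhere else:

  • the pip install joblib step, which puts joblib in the venv, and more importantly
  • the commented-out #deactivate, which leaves the venv active while the pex runs.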

Any hints on the root cause would be very welcome.
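A workaround that may be worth trying, assuming the workers fail because they start with a default module search path, is to copy the parent's sys.path into PYTHONPATH before any worker is started. This is a minimal sketch and has not been tested against pex or loky. The helper name export_sys_path is hypothetical, and the approach relies on the assumption that worker processes inherit the parent's environment variables.

```python
import os
import sys


def export_sys_path(environ=os.environ):
    """Copy the current sys.path into PYTHONPATH for child interpreters.

    Existing PYTHONPATH entries are kept after the sys.path entries.
    Returns the new PYTHONPATH value.
    """
    entries = [p for p in sys.path if p]  # drop the empty "current dir" entry
    existing = environ.get("PYTHONPATH")
    if existing:
        entries.extend(
            p for p in existing.split(os.pathsep) if p and p not in entries
        )
    environ["PYTHONPATH"] = os.pathsep.join(entries)
    return environ["PYTHONPATH"]


# Demonstration on a copy of the environment, so nothing global is modified.
env = dict(os.environ)
exported = export_sys_path(env)
print(exported)

# Hypothetical usage in test.py, before any worker is created:
#     export_sys_path()
#     print(Parallel(n_jobs=2)(jobs))
```

Called with no argument, the helper modifies os.environ in place, which is what a child process would inherit. Whether this is enough for the pex case depends on how the bundled joblib is laid out on disk, so treat it as an experiment and not as a fix.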

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Reactions: 2
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

ptuls commented, Dec 17, 2020 (2 reactions)

Has there been an update on this issue?

thundergolfer commented, Aug 26, 2020 (0 reactions)

We’re also experiencing this issue with https://github.com/google/subpar which is similar to both Pex and Linkedin/Shiv.
