Incompatibility with joblib
See original GitHub issue

joblib with its loky backend is widely used for distributed computing. Without going into the details, the loky backend has a lot of advantages over the multiprocessing backend.

Loguru does not seem to work with joblib/loky because of a pickle issue. I tried to apply the hints from the documentation without success.

I was wondering whether it's possible to make Loguru compatible with joblib:
import sys

from joblib import Parallel, delayed
from loguru import logger


def func_async():
    logger.info("Hello")


# logger.remove()
# logger.add(sys.stderr, enqueue=True)

args = [delayed(func_async)() for _ in range(100)]
p = Parallel(n_jobs=16, backend="loky")
results = p(args)
The error:
---------------------------------------------------------------------------
_RemoteTraceback Traceback (most recent call last)
_RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/joblib/externals/loky/backend/queues.py", line 153, in _feed
obj_ = dumps(obj, reducers=reducers)
File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/joblib/externals/loky/backend/reduction.py", line 271, in dumps
dump(obj, buf, reducers=reducers, protocol=protocol)
File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/joblib/externals/loky/backend/reduction.py", line 264, in dump
_LokyPickler(file, reducers=reducers, protocol=protocol).dump(obj)
File "/home/hadim/local/conda/envs/circus/lib/python3.8/site-packages/joblib/externals/cloudpickle/cloudpickle_fast.py", line 563, in dump
return Pickler.dump(self, obj)
TypeError: cannot pickle '_thread.lock' object
"""
The above exception was the direct cause of the following exception:
PicklingError Traceback (most recent call last)
/tmp/ipykernel_2094805/2576140850.py in <module>
13
14 p = Parallel(n_jobs=16, backend="loky")
---> 15 results = p(args)
~/local/conda/envs/circus/lib/python3.8/site-packages/joblib/parallel.py in __call__(self, iterable)
1052
1053 with self._backend.retrieval_context():
-> 1054 self.retrieve()
1055 # Make sure that we get a last message telling us we are done
1056 elapsed_time = time.time() - self._start_time
~/local/conda/envs/circus/lib/python3.8/site-packages/joblib/parallel.py in retrieve(self)
931 try:
932 if getattr(self._backend, 'supports_timeout', False):
--> 933 self._output.extend(job.get(timeout=self.timeout))
934 else:
935 self._output.extend(job.get())
~/local/conda/envs/circus/lib/python3.8/site-packages/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
540 AsyncResults.get from multiprocessing."""
541 try:
--> 542 return future.result(timeout=timeout)
543 except CfTimeoutError as e:
544 raise TimeoutError from e
~/local/conda/envs/circus/lib/python3.8/concurrent/futures/_base.py in result(self, timeout)
442 raise CancelledError()
443 elif self._state == FINISHED:
--> 444 return self.__get_result()
445 else:
446 raise TimeoutError()
~/local/conda/envs/circus/lib/python3.8/concurrent/futures/_base.py in __get_result(self)
387 if self._exception:
388 try:
--> 389 raise self._exception
390 finally:
391 # Break a reference cycle with the exception in self._exception
PicklingError: Could not pickle the task to send it to the workers.
Issue Analytics
- Created 2 years ago
- Comments: 9 (3 by maintainers)
Top GitHub Comments
So, I was able to reproduce the issue using a Jupyter notebook. I can't tell if this is related to the problem you're facing in production.

This happens due to an internal Lock used by sys.stderr, I guess. Here is a reproducible example without involving Loguru:
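(The reproducer snippet itself did not survive the scrape of this page. Below is a minimal sketch of what such a Loguru-free reproducer could look like, assuming the point is simply to hand the workers an object containing a bare lock; the function name and the range/n_jobs values are illustrative, not taken from the original comment.)

import threading

from joblib import Parallel, delayed


def func_async(lock):
    # The bare lock stands in for the one hidden inside the default handler.
    with lock:
        print("Hello")


# cloudpickle cannot serialize a threading.Lock, so loky fails with
# "TypeError: cannot pickle '_thread.lock' object" and joblib surfaces it
# as the PicklingError shown in the traceback above.
args = [delayed(func_async)(threading.Lock()) for _ in range(4)]
results = Parallel(n_jobs=2, backend="loky")(args)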
The thing is that the logger is configured with the sys.stderr handler by default, which can't be pickled. I see three possible workarounds:
- Removing the sys.stderr handler before running the job and thus avoiding the need to pickle it. In this case, each worker will have its own handler but shared resources won't be protected, which is not ideal.
- logger.add(lambda m: sys.stderr.write(m)) after calling logger.remove() seems to do the trick. Again, at worker initialization, the handler will be deep-copied during pickling, so sys.stderr won't be protected from parallel access.
- Sharing the logger and its handlers by making the workers inherit from it instead of pickling it (just like it's done with the args of Process).

I don't know the joblib API, maybe you can find a clean way to pass the handler by inheritance without pickle?
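(For reference, a minimal sketch of the second workaround applied to the example from the issue, following the comment above; only the sink replacement is new, the rest is the original snippet.)

import sys

from joblib import Parallel, delayed
from loguru import logger

# Replace the default handler, which wraps sys.stderr and its internal lock,
# with a plain callable sink that loky's cloudpickle can serialize.
logger.remove()
logger.add(lambda m: sys.stderr.write(m))


def func_async():
    logger.info("Hello")


args = [delayed(func_async)() for _ in range(100)]
results = Parallel(n_jobs=16, backend="loky")(args)

As noted above, each worker ends up with its own deep-copied handler, so writes to sys.stderr are not synchronized across processes.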
I’m having the same issue in scripts (not Jupyter).