
Random timeout in test_nested_exception_dispatch


This was observed in the linux_pypy3 CI of #1018.
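For context, the failing test drives two levels of nested Parallel calls whose innermost task raises; here is a rough sketch of that structure, reconstructed from the full traceback quoted later in this thread. The function names and the outer range(30) come from the trace; the bodies, inner ranges and exception details are assumptions (the real test raises a custom MyExceptionWithFinickyInit rather than a bare ValueError).

# Rough sketch of the nested-dispatch pattern the test exercises;
# not the verbatim test code from joblib/test/test_parallel.py.
from joblib import Parallel, delayed

def exception_raiser(i):
    # stand-in for the finicky custom exception raised in the real test
    raise ValueError("boom")

def nested_function_inner(i):
    Parallel(n_jobs=2)(delayed(exception_raiser)(j) for j in range(4))

def nested_function_outer(i):
    Parallel(n_jobs=2)(delayed(nested_function_inner)(j) for j in range(4))

# the test asserts that the error raised three levels down reaches the caller
Parallel(n_jobs=2, backend="multiprocessing")(
    delayed(nested_function_outer)(i) for i in range(30))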

=================================== FAILURES ===================================
_______________ test_nested_exception_dispatch[multiprocessing] ________________
multiprocessing.pool.RemoteTraceback: 
"""
block      = True
lock       = <locked _thread.lock object at 0x0000000003f453b8>
self       = <Thread(Thread-326, started daemon 140714991130368)>
timeout    = -1

pypy3.6-v7.3.1-linux64/lib-python/3/threading.py:1072: Failed

Here is part of the captured stderr:

----------------------------- Captured stderr call -----------------------------
[DEBUG:MainProcess:MainThread] created semlock with handle 140715172077568
[DEBUG:MainProcess:MainThread] created semlock with handle 140715172073472
[DEBUG:MainProcess:MainThread] created semlock with handle 140715172069376
[DEBUG:MainProcess:MainThread] joining task handler

+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++

~~~~~~~~~~~~~~~~~~~~ Stack of Thread-326 (140714991130368) ~~~~~~~~~~~~~~~~~~~~~
  File "/home/vsts/work/1/s/pypy3.6-v7.3.1-linux64/lib-python/3/threading.py", line 884, in _bootstrap
    self._bootstrap_inner()
  File "/home/vsts/work/1/s/pypy3.6-v7.3.1-linux64/lib-python/3/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/vsts/work/1/s/pypy3.6-v7.3.1-linux64/lib-python/3/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/vsts/work/1/s/pypy3.6-v7.3.1-linux64/lib-python/3/multiprocessing/pool.py", line 446, in _handle_tasks
    outqueue.put(None)
  File "/home/vsts/work/1/s/joblib/pool.py", line 169, in put
    wlock_acquire()

+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++
[DEBUG:MainProcess:MainThread] closing pool
[DEBUG:MainProcess:MainThread] terminating pool
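
The stack dump above shows the multiprocessing task handler thread stuck in outqueue.put(None), blocked on wlock_acquire() inside joblib's queue implementation, while the main thread is terminating the pool. If that write lock is never released, the join() on the handler thread during pool termination cannot return, and the test only ends when pytest's 30s timeout fires. A minimal sketch of that hang shape (illustrative only; it uses a bare lock rather than joblib's CustomizablePicklingQueue):

# Minimal sketch of the observed hang: a handler thread blocks forever
# acquiring a lock inside a put(), so the join() issued during pool
# termination never returns. Illustrative, not joblib's actual code.
import threading

write_lock = threading.Lock()
write_lock.acquire()            # the lock is already held elsewhere...

def task_handler():
    write_lock.acquire()        # ...so the handler blocks, like wlock_acquire()

t = threading.Thread(target=task_handler, daemon=True)
t.start()
t.join(timeout=2)               # stands in for _terminate_pool's task_handler.join()
print("handler still alive:", t.is_alive())   # True: the join timed out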

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

1 reaction
ogrisel commented, Apr 22, 2020

If it’s very rare it’s fine. I just opened the issue to accumulate additional evidence in case it’s reproduced later on other platforms, which would mean it’s a real bug and not just a slow PyPy CI host.

0 reactions
ogrisel commented, May 4, 2020

I observed it once locally on Linux with CPython 3.8.1…

=================================================================================== FAILURES ===================================================================================
_______________________________________________________________ test_nested_exception_dispatch[multiprocessing] ________________________________________________________________
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/ogrisel/miniconda3/envs/pylatest/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/ogrisel/code/joblib/joblib/_parallel_backends.py", line 624, in __call__
    return self.func(*args, **kwargs)
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 249, in __call__
    return [func(*args, **kwargs)
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 249, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/ogrisel/code/joblib/joblib/test/test_parallel.py", line 527, in nested_function_outer
    Parallel(n_jobs=2)(
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 1005, in __call__
    self.retrieve()
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 905, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/ogrisel/miniconda3/envs/pylatest/lib/python3.8/multiprocessing/pool.py", line 768, in get
    raise self._value
  File "/home/ogrisel/miniconda3/envs/pylatest/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/ogrisel/code/joblib/joblib/_parallel_backends.py", line 624, in __call__
    return self.func(*args, **kwargs)
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 249, in __call__
    return [func(*args, **kwargs)
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 249, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/ogrisel/code/joblib/joblib/test/test_parallel.py", line 522, in nested_function_inner
    Parallel(n_jobs=2)(
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 995, in __call__
    while self.dispatch_one_batch(iterator):
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 831, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 750, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/ogrisel/code/joblib/joblib/_parallel_backends.py", line 209, in apply_async
    result = ImmediateResult(func)
  File "/home/ogrisel/code/joblib/joblib/_parallel_backends.py", line 601, in __init__
    self.results = batch()
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 249, in __call__
    return [func(*args, **kwargs)
  File "/home/ogrisel/code/joblib/joblib/parallel.py", line 249, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/ogrisel/code/joblib/joblib/test/test_parallel.py", line 102, in exception_raiser
    raise (MyExceptionWithFinickyInit('a', 'b', 'c', 'd')
ValueError
"""

The above exception was the direct cause of the following exception:

self = Parallel(n_jobs=2)

    def retrieve(self):
        self._output = list()
        while self._iterating or len(self._jobs) > 0:
            if len(self._jobs) == 0:
                # Wait for an async callback to dispatch new jobs
                time.sleep(0.01)
                continue
            # We need to be careful: the job list can be filling up as
            # we empty it and Python list are not thread-safe by default hence
            # the use of the lock
            with self._lock:
                job = self._jobs.pop(0)
    
            try:
                if getattr(self._backend, 'supports_timeout', False):
>                   self._output.extend(job.get(timeout=self.timeout))

backend    = <joblib._parallel_backends.MultiprocessingBackend object at 0x7f3176b2ebb0>
ensure_ready = False
job        = <multiprocessing.pool.ApplyResult object at 0x7f3180e03100>
self       = Parallel(n_jobs=2)

joblib/parallel.py:905: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <multiprocessing.pool.ApplyResult object at 0x7f3180e03100>, timeout = None

    def get(self, timeout=None):
        self.wait(timeout)
        if not self.ready():
            raise TimeoutError
        if self._success:
            return self._value
        else:
>           raise self._value
E           ValueError

self       = <multiprocessing.pool.ApplyResult object at 0x7f3180e03100>
timeout    = None

../../miniconda3/envs/pylatest/lib/python3.8/multiprocessing/pool.py:768: ValueError

During handling of the above exception, another exception occurred:

backend = 'multiprocessing'

    @with_multiprocessing
    @parametrize('backend', PARALLEL_BACKENDS)
    def test_nested_exception_dispatch(backend):
        """Ensure errors for nested joblib cases gets propagated
    
        We rely on the Python 3 built-in __cause__ system that already
        report this kind of information to the user.
        """
        with raises(ValueError) as excinfo:
>           Parallel(n_jobs=2, backend=backend)(
                delayed(nested_function_outer)(i) for i in range(30))

backend    = 'multiprocessing'
excinfo    = <ExceptionInfo for raises contextmanager>

joblib/test/test_parallel.py:540: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
joblib/parallel.py:1005: in __call__
    self.retrieve()
        backend_name = 'MultiprocessingBackend'
        iterable   = <generator object test_nested_exception_dispatch.<locals>.<genexpr> at 0x7f3180e47ba0>
        iterator   = <itertools.islice object at 0x7f3180d129f0>
        n_jobs     = 2
        pre_dispatch = 4
        self       = Parallel(n_jobs=2)
joblib/parallel.py:927: in retrieve
    backend.abort_everything(ensure_ready=ensure_ready)
        backend    = <joblib._parallel_backends.MultiprocessingBackend object at 0x7f3176b2ebb0>
        ensure_ready = False
        job        = <multiprocessing.pool.ApplyResult object at 0x7f3180e03100>
        self       = Parallel(n_jobs=2)
joblib/_parallel_backends.py:258: in abort_everything
    self.terminate()
        ensure_ready = False
        self       = <joblib._parallel_backends.MultiprocessingBackend object at 0x7f3176b2ebb0>
joblib/_parallel_backends.py:493: in terminate
    super(MultiprocessingBackend, self).terminate()
        __class__  = <class 'joblib._parallel_backends.MultiprocessingBackend'>
        self       = <joblib._parallel_backends.MultiprocessingBackend object at 0x7f3176b2ebb0>
joblib/_parallel_backends.py:244: in terminate
    self._pool.terminate()  # terminate does a join()
        self       = <joblib._parallel_backends.MultiprocessingBackend object at 0x7f3176b2ebb0>
joblib/pool.py:317: in terminate
    super(MemmappingPool, self).terminate()
        __class__  = <class 'joblib.pool.MemmappingPool'>
        i          = 0
        n_retries  = 10
        self       = <joblib.pool.MemmappingPool state=TERMINATE pool_size=2>
../../miniconda3/envs/pylatest/lib/python3.8/multiprocessing/pool.py:656: in terminate
    self._terminate()
        self       = <joblib.pool.MemmappingPool state=TERMINATE pool_size=2>
../../miniconda3/envs/pylatest/lib/python3.8/multiprocessing/util.py:201: in __call__
    res = self._callback(*self._args, **self._kwargs)
        _finalizer_registry = {(20, 15): <Finalize object, dead>, (None, 828): <Finalize object, callback=close_fds, args=(19, 23)>, (None, 829): <Finalize object, callback=close_fds, args=(21, 25)>}
        getpid     = <built-in function getpid>
        self       = <Finalize object, callback=_terminate_pool, args=(<_queue.SimpleQueue object at 0x7f3180d12cc0>, <joblib.pool.Customiz...Result object at 0x7f3180d11070>, 5529: <multiprocessing.pool.ApplyResult object at 0x7f3180d11610>}), exitpriority=15>
        sub_debug  = <function sub_debug at 0x7f318949fdc0>
        wr         = None
../../miniconda3/envs/pylatest/lib/python3.8/multiprocessing/pool.py:714: in _terminate_pool
    task_handler.join()
        cache      = {5527: <multiprocessing.pool.ApplyResult object at 0x7f3180e03f70>, 5528: <multiprocessing.pool.ApplyResult object at 0x7f3180d11070>, 5529: <multiprocessing.pool.ApplyResult object at 0x7f3180d11610>}
        change_notifier = <multiprocessing.queues.SimpleQueue object at 0x7f31810ae430>
        cls        = <class 'joblib.pool.MemmappingPool'>
        inqueue    = <joblib.pool.CustomizablePicklingQueue object at 0x7f3180e2c9a0>
        outqueue   = <joblib.pool.CustomizablePicklingQueue object at 0x7f31810bc340>
        p          = <ForkProcess name='ForkPoolWorker-289' pid=3547 parent=1597 stopped exitcode=-SIGTERM daemon>
        pool       = [<ForkProcess name='ForkPoolWorker-288' pid=3546 parent=1597 stopped exitcode=-SIGTERM daemon>, <ForkProcess name='ForkPoolWorker-289' pid=3547 parent=1597 stopped exitcode=-SIGTERM daemon>]
        result_handler = <Thread(Thread-481, stopped daemon 139850402957056)>
        task_handler = <Thread(Thread-480, started daemon 139850588718848)>
        taskqueue  = <_queue.SimpleQueue object at 0x7f3180d12cc0>
        worker_handler = <Thread(Thread-479, stopped daemon 139849906517760)>
../../miniconda3/envs/pylatest/lib/python3.8/threading.py:1011: in join
    self._wait_for_tstate_lock()
        self       = <Thread(Thread-480, started daemon 139850588718848)>
        timeout    = None
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <Thread(Thread-480, started daemon 139850588718848)>, block = True, timeout = -1

    def _wait_for_tstate_lock(self, block=True, timeout=-1):
        # Issue #18808: wait for the thread state to be gone.
        # At the end of the thread's life, after all knowledge of the thread
        # is removed from C data structures, C code releases our _tstate_lock.
        # This method passes its arguments to _tstate_lock.acquire().
        # If the lock is acquired, the C code is done, and self._stop() is
        # called.  That sets ._is_stopped to True, and ._tstate_lock to None.
        lock = self._tstate_lock
        if lock is None:  # already determined that the C code is done
            assert self._is_stopped
>       elif lock.acquire(block, timeout):
E       Failed: Timeout >30.0s

block      = True
lock       = <locked _thread.lock object at 0x7f3180e032d0>
self       = <Thread(Thread-480, started daemon 139850588718848)>
timeout    = -1

../../miniconda3/envs/pylatest/lib/python3.8/threading.py:1027: Failed
----------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------
[DEBUG:MainProcess:MainThread] created semlock with handle 139850781663232
[DEBUG:MainProcess:MainThread] created semlock with handle 139850781659136
[DEBUG:MainProcess:MainThread] created semlock with handle 139850781212672
[DEBUG:MainProcess:MainThread] created semlock with handle 139850781208576
[DEBUG:MainProcess:MainThread] created semlock with handle 139850781204480
[DEBUG:MainProcess:MainThread] created semlock with handle 139850781200384
[DEBUG:MainProcess:MainThread] added worker
[INFO:ForkPoolWorker-288:MainThread] child process calling self.run()
[DEBUG:MainProcess:MainThread] added worker
[INFO:ForkPoolWorker-289:MainThread] child process calling self.run()
[DEBUG:ForkPoolWorker-288:MainThread] created semlock with handle 139850751283200
[DEBUG:ForkPoolWorker-289:MainThread] created semlock with handle 139850751283200
[DEBUG:ForkPoolWorker-289:MainThread] created semlock with handle 139850751279104
[DEBUG:ForkPoolWorker-288:MainThread] created semlock with handle 139850751279104
[DEBUG:ForkPoolWorker-289:MainThread] added worker
[DEBUG:ForkPoolWorker-288:MainThread] added worker
[DEBUG:ForkPoolWorker-288:MainThread] added worker
[DEBUG:ForkPoolWorker-289:MainThread] added worker
[DEBUG:ForkPoolWorker-288:MainThread] closing pool
[DEBUG:ForkPoolWorker-288:MainThread] terminating pool
[DEBUG:ForkPoolWorker-288:MainThread] finalizing pool
[DEBUG:ForkPoolWorker-288:MainThread] helping task handler/workers to finish
[DEBUG:ForkPoolWorker-288:MainThread] joining worker handler
[DEBUG:ForkPoolWorker-288:Thread-479] worker got sentinel -- exiting
[DEBUG:ForkPoolWorker-288:Thread-483] result handler found thread._state=TERMINATE
[DEBUG:ForkPoolWorker-288:Thread-480] worker got sentinel -- exiting
[DEBUG:ForkPoolWorker-288:Thread-481] worker handler exiting
[DEBUG:ForkPoolWorker-288:Thread-479] worker exiting after 4 tasks
[DEBUG:ForkPoolWorker-288:Thread-482] task handler got sentinel
[DEBUG:ForkPoolWorker-288:Thread-483] result handler exiting: len(cache)=0, thread._state=TERMINATE
[DEBUG:ForkPoolWorker-288:Thread-480] worker exiting after 0 tasks
[DEBUG:ForkPoolWorker-288:MainThread] joining task handler
[DEBUG:ForkPoolWorker-288:Thread-482] task handler sending sentinel to result handler
[DEBUG:ForkPoolWorker-288:Thread-482] task handler sending sentinel to workers
[DEBUG:ForkPoolWorker-288:Thread-482] task handler exiting
[DEBUG:ForkPoolWorker-288:MainThread] joining result handler
[DEBUG:ForkPoolWorker-289:MainThread] closing pool
[DEBUG:ForkPoolWorker-289:MainThread] terminating pool
[DEBUG:ForkPoolWorker-289:MainThread] finalizing pool
[DEBUG:ForkPoolWorker-289:MainThread] helping task handler/workers to finish
[DEBUG:ForkPoolWorker-289:MainThread] joining worker handler
[DEBUG:ForkPoolWorker-289:Thread-483] result handler found thread._state=TERMINATE
[DEBUG:ForkPoolWorker-289:Thread-483] result handler exiting: len(cache)=0, thread._state=TERMINATE
[DEBUG:ForkPoolWorker-289:Thread-479] worker got sentinel -- exiting
[DEBUG:ForkPoolWorker-289:Thread-481] worker handler exiting
[DEBUG:MainProcess:MainThread] closing pool
[DEBUG:ForkPoolWorker-289:Thread-480] worker got sentinel -- exiting
[DEBUG:ForkPoolWorker-289:Thread-482] task handler got sentinel
[DEBUG:MainProcess:MainThread] terminating pool
[DEBUG:ForkPoolWorker-289:Thread-479] worker exiting after 4 tasks
[DEBUG:ForkPoolWorker-289:MainThread] joining task handler
[DEBUG:ForkPoolWorker-288:MainThread] created semlock with handle 139850596077568
[DEBUG:ForkPoolWorker-289:Thread-480] worker exiting after 0 tasks
[DEBUG:MainProcess:MainThread] finalizing pool
[DEBUG:ForkPoolWorker-289:Thread-482] task handler sending sentinel to result handler
[DEBUG:MainProcess:MainThread] helping task handler/workers to finish
[DEBUG:ForkPoolWorker-289:Thread-482] task handler sending sentinel to workers
[DEBUG:ForkPoolWorker-289:Thread-482] task handler exiting
[DEBUG:ForkPoolWorker-288:MainThread] created semlock with handle 139850596073472
[DEBUG:MainProcess:MainThread] removing tasks from inqueue until task handler finished
[DEBUG:ForkPoolWorker-289:MainThread] joining result handler
[DEBUG:ForkPoolWorker-288:MainThread] added worker
[DEBUG:MainProcess:Thread-481] result handler found thread._state=TERMINATE
[DEBUG:ForkPoolWorker-288:MainThread] added worker
[DEBUG:MainProcess:Thread-481] ensuring that outqueue is not full
[DEBUG:MainProcess:Thread-481] result handler exiting: len(cache)=3, thread._state=TERMINATE
[DEBUG:MainProcess:Thread-479] worker handler exiting
[DEBUG:MainProcess:MainThread] joining worker handler
[DEBUG:MainProcess:MainThread] terminating workers
[DEBUG:MainProcess:MainThread] joining task handler
[DEBUG:MainProcess:Thread-480] task handler got sentinel
[DEBUG:MainProcess:Thread-480] task handler sending sentinel to result handler
[DEBUG:MainProcess:Thread-433] worker got sentinel -- exiting
[DEBUG:MainProcess:Thread-433] worker exiting after 1 tasks
[DEBUG:MainProcess:Thread-434] worker got sentinel -- exiting
[DEBUG:MainProcess:Thread-434] worker exiting after 1 tasks

+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++

~~~~~~~~~~~~~~~~~~~~ Stack of Thread-480 (139850588718848) ~~~~~~~~~~~~~~~~~~~~~
  File "/home/ogrisel/miniconda3/envs/pylatest/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/home/ogrisel/miniconda3/envs/pylatest/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/ogrisel/miniconda3/envs/pylatest/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ogrisel/miniconda3/envs/pylatest/lib/python3.8/multiprocessing/pool.py", line 559, in _handle_tasks
    outqueue.put(None)
  File "/home/ogrisel/code/joblib/joblib/pool.py", line 169, in put
    wlock_acquire()

+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++
[DEBUG:MainProcess:MainThread] closing pool
[DEBUG:MainProcess:MainThread] terminating pool
[DEBUG:MainProcess:MainThread] Sucessfully deleted /dev/shm/joblib_memmapping_folder_1597_139850592470208
=========================================================================== short test summary info ============================================================================
FAILED joblib/test/test_parallel.py::test_nested_exception_dispatch[multiprocessing] - Failed: Timeout >30.0s
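
As an aside, the "+++ Timeout +++" blocks with per-thread stacks in the output above are what pytest-timeout emits in its thread-based mode when the 30s limit is hit. For local debugging of a hang like this one, a similar dump can be obtained from the standard-library faulthandler module (a generic diagnostic suggestion, not the CI's actual configuration):

# Dump all thread stacks to stderr if the process is still running after
# 30 seconds, similar to the pytest-timeout dumps shown above.
import faulthandler
import sys

faulthandler.dump_traceback_later(timeout=30, exit=True, file=sys.stderr)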