question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Parallel hangs when using Dask as a back-end

See original GitHub issue

Hello,

I have been using Dask through joblib and had somes problem of deadlocks for the execution of some tasks.

I am using Parallel with Dask as a backend. Parallel.__call__ does not returns. After having modified some of the code with logs in scikit-learn 0.20.x distribution of joblib and distributed master; what I observed is that Tasks are not submitted at one point. Apart from that it seems that tasks on workers are executed properly.

I have inspected Parallel code, I think that I may have found a possible explanation of possible deadlocks.


The dispatch of tasks in made as follows in Parallel.__call__:

if self.dispatch_one_batch(iterator):
    self._iterating = self._original_iterator is not None

while self.dispatch_one_batch(iterator):
    logging.info("Parallel.__call__ dispatched one batch")
    pass

in Parallel.dispatch_one_batch Parallel._dispatch is called. As indicated in the docstring Parallel._dispatchthis method is not thread-safe: it should be only called indirectly via dispatch_one_batch”.

_dispatch does use BatchCompletionCallBack’s objects that are called by the backendapply_async. In the case of SequentialBackend’s, the execution is made in the same thread so there are no problems ; but in the case of Dask, DaskDistributedBackend.apply_async does register this callback on tornado.IOLoop:

class DaskDistributedBackend(ParallelBackendBase, AutoBatchingMixin):

    # ...

    def apply_async(self, func, callback=None):
        key = '%s-batch-%s' % (_funcname(func), uuid4().hex)
        func, args = self._to_func_args(func)

        future = self.client.submit(func, *args, key=key, **self.submit_kwargs)
        self.task_futures.add(future)

        @gen.coroutine
        def callback_wrapper():
            result = yield _wait([future])
            self.task_futures.remove(future)
            if callback is not None:
                callback(result)  # gets called in separate thread

        self.client.loop.add_callback(callback_wrapper)
        # ... 

And as indicated, it “gets called in another thread” ; hence possibly producing a dead-lock if the lock is already taken in the main thread.

I am not entirely sure this could be a valid explanation. Also, I do not see a way to do handle the callback differently in DaskDistributedBackend, nor do I understand all the moving parts of this connector.

I am inspecting it currently. I’ll be glad to have your thought on it, @ogrisel and @mrocklin. While I can hardly give a minimal complete reproducible example with my setup (sorry), I think that the problem goes hand in hand with specific workloads and can be present in a variety of setups and observed in a variety of errors that are hard to predict.

However, you will find logs that present the problem ; I hope they are useful.

Please let me know if you need some more clues.


Setup:

Python version: 2.7.5 (default, Apr  9 2019, 14:30:50)
Dask version: 1.2.2+10.gd53b560
Distributed version: 1.28.1+26.gb876796.dirty
Client-side logs
[2019-05-17 08:18:35,615] [22/MainThread] [INFO] [root] Setting up Dask
[2019-05-17 08:18:35,616] [22/MainThread] [INFO] [root] Python      version:
[2019-05-17 08:18:35,616] [22/MainThread] [INFO] [root] 2.7.5 (default, Apr  9 2019, 14:30:50)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
[2019-05-17 08:18:35,616] [22/MainThread] [INFO] [root] Dask        version: 1.2.2+10.gd53b560
[2019-05-17 08:18:35,616] [22/MainThread] [INFO] [root] Distributed version: 1.28.1+26.gb876796.dirty
[2019-05-17 08:18:46,970] [22/MainThread] [INFO] [root] Dask Scheduler started on port 8786
[2019-05-17 08:18:46,971] [22/MainThread] [INFO] [root] Done with Dask Scheduler Setup
[2019-05-17 08:18:47,302] [22/Thread-1] [INFO] [root] Scheduler.__init__ called
[2019-05-17 08:18:47,307] [22/Thread-1] [INFO] [root] Scheduler._setup_logging called
[2019-05-17 08:18:48,938] [22/Thread-1] [INFO] [root] Scheduler.start called
distributed.scheduler - INFO - Clear task state
distributed.scheduler - INFO -   Scheduler at:     tcp://10.12.3.73:8786
distributed.scheduler - INFO -       bokeh at:                     :8787
[2019-05-17 08:18:49,069] [22/Thread-1] [INFO] [root] Scheduler.finished called
[2019-05-17 08:18:49,074] [22/Thread-1] [INFO] [root] Scheduler.add_worker called
[2019-05-17 08:18:49,074] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
distributed.scheduler - INFO - Register tcp://10.12.3.74:33599
[2019-05-17 08:18:49,077] [22/Thread-1] [INFO] [root] Scheduler.handle_worker called
distributed.scheduler - INFO - Starting worker compute stream, tcp://10.12.3.74:33599
distributed.core - INFO - Starting established connection
[2019-05-17 08:18:49,087] [22/Thread-1] [INFO] [root] Scheduler.add_worker called
[2019-05-17 08:18:49,088] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
distributed.scheduler - INFO - Register tcp://10.12.3.75:42309
[2019-05-17 08:18:49,091] [22/Thread-1] [INFO] [root] Scheduler.handle_worker called
distributed.scheduler - INFO - Starting worker compute stream, tcp://10.12.3.75:42309
distributed.core - INFO - Starting established connection
[2019-05-17 08:18:49,095] [22/Thread-1] [INFO] [root] Scheduler.add_worker called
[2019-05-17 08:18:49,096] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
distributed.scheduler - INFO - Register tcp://10.12.3.76:36063
[2019-05-17 08:18:49,099] [22/Thread-1] [INFO] [root] Scheduler.handle_worker called
[2019-05-17 08:18:49,099] [22/MainThread] [INFO] [root] Checking series of type: float64 (isM8=False)
distributed.scheduler - INFO - Starting worker compute stream, tcp://10.12.3.76:36063
distributed.core - INFO - Starting established connection
[2019-05-17 08:18:49,212] [22/MainThread] [INFO] [root] Creating Dask client
[2019-05-17 08:18:49,233] [22/Thread-1] [INFO] [root] Scheduler.add_client called
distributed.scheduler - INFO - Receive client connection: Client-65e26f35-787c-11e9-8016-0a580a0c0349
distributed.core - INFO - Starting established connection
[2019-05-17 08:18:49,242] [22/MainThread] [INFO] [root] register_parallel_backend called
[2019-05-17 08:18:49,242] [22/MainThread] [INFO] [root] Creating joblib.Parallel(n_jobs=4, verbose=100, pre_dispatch=n_jobs, backend=dask)
[2019-05-17 08:18:49,242] [22/MainThread] [INFO] [root] Parallel.__init__ called
[2019-05-17 08:18:49,242] [22/MainThread] [INFO] [root] DaskDistributedBackend.__init__ called
[2019-05-17 08:18:49,242] [22/MainThread] [INFO] [root] Worker.get_client called
[2019-05-17 08:18:49,242] [22/MainThread] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:49,242] [22/MainThread] [INFO] [root] Parallel.__call__ called
[2019-05-17 08:18:49,243] [22/MainThread] [INFO] [root] Parallel._initialize_backend called
[2019-05-17 08:18:49,243] [22/MainThread] [INFO] [root] DaskDistributedBackend.configure called
[2019-05-17 08:18:49,243] [22/MainThread] [INFO] [root] DaskDistributedBackend.effective_n_jobs called
[2019-05-17 08:18:49,251] [22/MainThread] [INFO] [root] Parallel._print called
[2019-05-17 08:18:49,251] [22/MainThread] [INFO] [root] DaskDistributedBackend.start_call called
[2019-05-17 08:18:49,251] [22/MainThread] [INFO] [root] Parallel.__init__: iterator created
[2019-05-17 08:18:49,251] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch called
[2019-05-17 08:18:49,251] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch: took the lock
[2019-05-17 08:18:49,251] [22/MainThread] [INFO] [root] DaskDistributedBackend.get_nested_backend called
[2019-05-17 08:18:49,251] [22/MainThread] [INFO] [root] DaskDistributedBackend.__init__ called
[2019-05-17 08:18:49,252] [22/MainThread] [INFO] [root] delayed called
[2019-05-17 08:18:49,254] [22/MainThread] [INFO] [root] Parallel.dispatch: calling self._dispatch
[2019-05-17 08:18:49,254] [22/MainThread] [INFO] [root] Parallel._dispatch called
[2019-05-17 08:18:49,254] [22/MainThread] [INFO] [root] Parallel._dispatch: submitting job of id 0 to back-end
[2019-05-17 08:18:49,254] [22/MainThread] [INFO] [root] DaskDistributedBackend.apply_async called
[2019-05-17 08:18:49,254] [22/MainThread] [INFO] [root] DaskDistributedBackend._to_func_args called
[2019-05-17 08:18:49,255] [22/MainThread] [INFO] [root] in '_dask._to_func_args'
[2019-05-17 08:18:49,255] [22/MainThread] [INFO] [root] Looping on func.items
[2019-05-17 08:18:49,255] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:49,262] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:49,270] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:49,280] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:49,280] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,280] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:49,280] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,294] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:49,303] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:49,308] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:49,312] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:49,312] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,312] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:49,313] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,322] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:49,328] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:49,331] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:49,334] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:49,334] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,334] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:49,334] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,344] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:49,351] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:49,354] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:49,356] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:49,357] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,357] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:49,357] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,369] [22/MainThread] [INFO] [root] Appending tasks done
[2019-05-17 08:18:49,369] [22/MainThread] [INFO] [root] Batch.__init__ called
[2019-05-17 08:18:49,369] [22/MainThread] [INFO] [root] Submitting to the client
[2019-05-17 08:18:49,369] [22/MainThread] [INFO] [root] Batch.__reduce__ called
[2019-05-17 08:18:49,370] [22/MainThread] [INFO] [root] Added future to tasks
[2019-05-17 08:18:49,371] [22/MainThread] [INFO] [root] Parallel._dispatch: job of id 0 inserted to back-end
[2019-05-17 08:18:49,371] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch: leaving the lock after self._dispatch
[2019-05-17 08:18:49,372] [22/Thread-1] [INFO] [root] Scheduler.update_graph called
[2019-05-17 08:18:49,373] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch called
[2019-05-17 08:18:49,373] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:49,373] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch: took the lock
[2019-05-17 08:18:49,374] [22/Thread-1] [INFO] [root] Scheduler.decide_worker called
[2019-05-17 08:18:49,374] [22/MainThread] [INFO] [root] DaskDistributedBackend.get_nested_backend called
[2019-05-17 08:18:49,374] [22/Thread-1] [INFO] [root] Scheduler.valid_workers called
[2019-05-17 08:18:49,374] [22/MainThread] [INFO] [root] DaskDistributedBackend.__init__ called
[2019-05-17 08:18:49,374] [22/Thread-1] [INFO] [root] Scheduler.decide_worker called
[2019-05-17 08:18:49,375] [22/Thread-1] [INFO] [root] Scheduler.consumre_resources called
[2019-05-17 08:18:49,375] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:49,375] [22/MainThread] [INFO] [root] delayed called
[2019-05-17 08:18:49,375] [22/Thread-1] [INFO] [root] Scheduler.send_task_to_worker called
[2019-05-17 08:18:49,376] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:49,377] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:49,377] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,378] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:49,378] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,379] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:49,379] [22/MainThread] [INFO] [root] Parallel.dispatch: calling self._dispatch
[2019-05-17 08:18:49,379] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,380] [22/MainThread] [INFO] [root] Parallel._dispatch called
[2019-05-17 08:18:49,380] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:49,380] [22/MainThread] [INFO] [root] Parallel._dispatch: submitting job of id 1 to back-end
[2019-05-17 08:18:49,381] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:49,382] [22/MainThread] [INFO] [root] DaskDistributedBackend.apply_async called
[2019-05-17 08:18:49,383] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:49,383] [22/MainThread] [INFO] [root] DaskDistributedBackend._to_func_args called
[2019-05-17 08:18:49,383] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:49,383] [22/MainThread] [INFO] [root] in '_dask._to_func_args'
[2019-05-17 08:18:49,384] [22/MainThread] [INFO] [root] Looping on func.items
[2019-05-17 08:18:49,384] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:49,397] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:49,400] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:49,485] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:49,586] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:49,687] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:49,788] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:49,888] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:49,990] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,090] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,191] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,292] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,392] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,493] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,594] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,694] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,795] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,895] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:50,996] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,097] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,113] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:51,113] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,113] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,113] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,122] [22/Thread-1] [INFO] [root] Scheduler.add_client called
distributed.scheduler - INFO - Receive client connection: Client-worker-67033d1e-787c-11e9-8016-0a580a0c034a
distributed.core - INFO - Starting established connection
[2019-05-17 08:18:51,165] [22/MainThread] [INFO] [root] maybe to future called
=[2019-05-17 08:18:51,171] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:51,173] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:51,176] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:51,176] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,176] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,176] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,194] [22/MainThread] [INFO] [root] Appending tasks done
[2019-05-17 08:18:51,194] [22/MainThread] [INFO] [root] Batch.__init__ called
[2019-05-17 08:18:51,194] [22/MainThread] [INFO] [root] Submitting to the client
[2019-05-17 08:18:51,194] [22/MainThread] [INFO] [root] Batch.__reduce__ called
[2019-05-17 08:18:51,195] [22/MainThread] [INFO] [root] Added future to tasks
[2019-05-17 08:18:51,196] [22/MainThread] [INFO] [root] Parallel._dispatch: job of id 1 inserted to back-end
[2019-05-17 08:18:51,196] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch: leaving the lock after self._dispatch
[2019-05-17 08:18:51,196] [22/MainThread] [INFO] [root] Parallel.__call__ dispatched one batch
[2019-05-17 08:18:51,196] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch called
[2019-05-17 08:18:51,196] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch: took the lock
[2019-05-17 08:18:51,196] [22/MainThread] [INFO] [root] DaskDistributedBackend.get_nested_backend called
[2019-05-17 08:18:51,197] [22/MainThread] [INFO] [root] DaskDistributedBackend.__init__ called
[2019-05-17 08:18:51,198] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,198] [22/MainThread] [INFO] [root] delayed called
[2019-05-17 08:18:51,199] [22/MainThread] [INFO] [root] Parallel.dispatch: calling self._dispatch
[2019-05-17 08:18:51,200] [22/MainThread] [INFO] [root] Parallel._dispatch called
[2019-05-17 08:18:51,200] [22/MainThread] [INFO] [root] Parallel._dispatch: submitting job of id 2 to back-end
[2019-05-17 08:18:51,200] [22/MainThread] [INFO] [root] DaskDistributedBackend.apply_async called
[2019-05-17 08:18:51,200] [22/MainThread] [INFO] [root] DaskDistributedBackend._to_func_args called
[2019-05-17 08:18:51,200] [22/MainThread] [INFO] [root] in '_dask._to_func_args'
[2019-05-17 08:18:51,200] [22/MainThread] [INFO] [root] Looping on func.items
[2019-05-17 08:18:51,200] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:51,207] [22/Thread-1] [INFO] [root] Scheduler.update_graph called
[2019-05-17 08:18:51,207] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,208] [22/Thread-1] [INFO] [root] Scheduler.decide_worker called
[2019-05-17 08:18:51,208] [22/Thread-1] [INFO] [root] Scheduler.valid_workers called
[2019-05-17 08:18:51,208] [22/Thread-1] [INFO] [root] Scheduler.decide_worker called
[2019-05-17 08:18:51,208] [22/Thread-1] [INFO] [root] Scheduler.consumre_resources called
[2019-05-17 08:18:51,208] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,209] [22/Thread-1] [INFO] [root] Scheduler.send_task_to_worker called
[2019-05-17 08:18:51,209] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:51,209] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,208] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:51,209] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,210] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,211] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,211] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,211] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,211] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,211] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,213] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:51,216] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,217] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,219] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:51,219] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,219] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,219] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,224] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:51,230] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:51,232] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:51,237] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:51,238] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,238] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,238] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,247] [22/MainThread] [INFO] [root] Appending tasks done
[2019-05-17 08:18:51,247] [22/MainThread] [INFO] [root] Batch.__init__ called
[2019-05-17 08:18:51,247] [22/MainThread] [INFO] [root] Submitting to the client
[2019-05-17 08:18:51,248] [22/MainThread] [INFO] [root] Batch.__reduce__ called
[2019-05-17 08:18:51,248] [22/MainThread] [INFO] [root] Added future to tasks
[2019-05-17 08:18:51,248] [22/MainThread] [INFO] [root] Parallel._dispatch: job of id 2 inserted to back-end
[2019-05-17 08:18:51,248] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch: leaving the lock after self._dispatch
[2019-05-17 08:18:51,248] [22/MainThread] [INFO] [root] Parallel.__call__ dispatched one batch
[2019-05-17 08:18:51,248] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch called
[2019-05-17 08:18:51,248] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch: took the lock
[2019-05-17 08:18:51,248] [22/MainThread] [INFO] [root] DaskDistributedBackend.get_nested_backend called
[2019-05-17 08:18:51,248] [22/MainThread] [INFO] [root] DaskDistributedBackend.__init__ called
[2019-05-17 08:18:51,249] [22/MainThread] [INFO] [root] delayed called
[2019-05-17 08:18:51,250] [22/MainThread] [INFO] [root] Parallel.dispatch: calling self._dispatch
[2019-05-17 08:18:51,251] [22/MainThread] [INFO] [root] Parallel._dispatch called
[2019-05-17 08:18:51,251] [22/MainThread] [INFO] [root] Parallel._dispatch: submitting job of id 3 to back-end
[2019-05-17 08:18:51,251] [22/MainThread] [INFO] [root] DaskDistributedBackend.apply_async called
[2019-05-17 08:18:51,251] [22/MainThread] [INFO] [root] DaskDistributedBackend._to_func_args called
[2019-05-17 08:18:51,251] [22/MainThread] [INFO] [root] in '_dask._to_func_args'
[2019-05-17 08:18:51,251] [22/MainThread] [INFO] [root] Looping on func.items
[2019-05-17 08:18:51,251] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:51,253] [22/Thread-1] [INFO] [root] Scheduler.update_graph called
[2019-05-17 08:18:51,254] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,254] [22/Thread-1] [INFO] [root] Scheduler.decide_worker called
[2019-05-17 08:18:51,254] [22/Thread-1] [INFO] [root] Scheduler.valid_workers called
[2019-05-17 08:18:51,254] [22/Thread-1] [INFO] [root] Scheduler.decide_worker called
[2019-05-17 08:18:51,254] [22/Thread-1] [INFO] [root] Scheduler.consumre_resources called
[2019-05-17 08:18:51,254] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.send_task_to_worker called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,255] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,264] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:51,265] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:51,277] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:51,278] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,278] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,278] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,282] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:51,290] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:51,292] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:51,295] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:51,295] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,295] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,296] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,301] [22/MainThread] [INFO] [root] Appending tasks done
[2019-05-17 08:18:51,301] [22/MainThread] [INFO] [root] Batch.__init__ called
[2019-05-17 08:18:51,301] [22/MainThread] [INFO] [root] Submitting to the client
[2019-05-17 08:18:51,301] [22/MainThread] [INFO] [root] Batch.__reduce__ called
[2019-05-17 08:18:51,302] [22/Thread-1] [INFO] [root] Scheduler.update_graph called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.decide_worker called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.valid_workers called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.decide_worker called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.worker_objective called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.worker_objective called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.consumre_resources called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.send_task_to_worker called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,303] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,304] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,304] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,304] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,304] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,304] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,305] [22/MainThread] [INFO] [root] Added future to tasks
[2019-05-17 08:18:51,306] [22/MainThread] [INFO] [root] Parallel._dispatch: job of id 3 inserted to back-end
[2019-05-17 08:18:51,306] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch: leaving the lock after self._dispatch
[2019-05-17 08:18:51,306] [22/MainThread] [INFO] [root] Parallel.__call__ dispatched one batch
[2019-05-17 08:18:51,306] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch called
[2019-05-17 08:18:51,306] [22/MainThread] [INFO] [root] Parallel.dispatch_one_batch: took the lock
[2019-05-17 08:18:51,306] [22/MainThread] [INFO] [root] DaskDistributedBackend.get_nested_backend called
[2019-05-17 08:18:51,306] [22/MainThread] [INFO] [root] DaskDistributedBackend.__init__ called
[2019-05-17 08:18:51,306] [22/MainThread] [INFO] [root] delayed called
[2019-05-17 08:18:51,308] [22/MainThread] [INFO] [root] Parallel.dispatch: calling self._dispatch
[2019-05-17 08:18:51,308] [22/MainThread] [INFO] [root] Parallel._dispatch called
[2019-05-17 08:18:51,308] [22/MainThread] [INFO] [root] Parallel._dispatch: submitting job of id 4 to back-end
[2019-05-17 08:18:51,308] [22/MainThread] [INFO] [root] DaskDistributedBackend.apply_async called
[2019-05-17 08:18:51,308] [22/MainThread] [INFO] [root] DaskDistributedBackend._to_func_args called
[2019-05-17 08:18:51,308] [22/MainThread] [INFO] [root] in '_dask._to_func_args'
[2019-05-17 08:18:51,308] [22/MainThread] [INFO] [root] Looping on func.items
[2019-05-17 08:18:51,308] [22/MainThread] [INFO] [root] maybe to future called
[2019-05-17 08:18:51,313] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,317] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,319] [22/IO loop] [INFO] [root] Worker.get_worker called
[2019-05-17 08:18:51,321] [22/Thread-1] [INFO] [root] Scheduler.scatter called
[2019-05-17 08:18:51,399] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,399] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,455] [22/Thread-1] [INFO] [root] Scheduler.handle_task_finished called
[2019-05-17 08:18:51,455] [22/Thread-1] [INFO] [root] Scheduler._remove_from_processing called
[2019-05-17 08:18:51,455] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,455] [22/Thread-1] [INFO] [root] Scheduler.release_resources called
[2019-05-17 08:18:51,455] [22/Thread-1] [INFO] [root] Scheduler._add_to_memory called
[2019-05-17 08:18:51,455] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,455] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,457] [22/IO loop] [INFO] [root] Parallel.print_progress called
[2019-05-17 08:18:51,457] [22/IO loop] [INFO] [root] Parallel._print called
[2019-05-17 08:18:51,500] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,500] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,595] [22/Thread-1] [INFO] [root] Scheduler.handle_task_finished called
[2019-05-17 08:18:51,596] [22/Thread-1] [INFO] [root] Scheduler._remove_from_processing called
[2019-05-17 08:18:51,596] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,596] [22/Thread-1] [INFO] [root] Scheduler.release_resources called
[2019-05-17 08:18:51,596] [22/Thread-1] [INFO] [root] Scheduler._add_to_memory called
[2019-05-17 08:18:51,596] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,596] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,596] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:51,596] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,596] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,597] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,597] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:51,597] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,597] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,597] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,600] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,601] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,695] [22/Thread-1] [INFO] [root] Scheduler.handle_task_finished called
[2019-05-17 08:18:51,695] [22/Thread-1] [INFO] [root] Scheduler._remove_from_processing called
[2019-05-17 08:18:51,696] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,696] [22/Thread-1] [INFO] [root] Scheduler.release_resources called
[2019-05-17 08:18:51,696] [22/Thread-1] [INFO] [root] Scheduler._add_to_memory called
[2019-05-17 08:18:51,696] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,696] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:51,696] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:51,697] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,697] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,697] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,697] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:51,697] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,697] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:51,697] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:51,701] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,802] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:51,902] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,003] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,104] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,204] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,305] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,407] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,508] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,608] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,709] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,810] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:52,910] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:53,011] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:53,061] [22/Thread-1] [INFO] [root] Scheduler.update_data called
[2019-05-17 08:18:53,062] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:53,062] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:53,062] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:53,076] [22/Thread-1] [INFO] [root] Scheduler.add_keys called
[2019-05-17 08:18:53,076] [22/Thread-1] [INFO] [root] Scheduler.add_keys called
[2019-05-17 08:18:53,076] [22/Thread-1] [INFO] [root] Scheduler.add_keys called
[2019-05-17 08:18:53,089] [22/Thread-1] [INFO] [root] Scheduler.add_client called
distributed.scheduler - INFO - Receive client connection: Client-worker-6830848a-787c-11e9-8016-0a580a0c034b
distributed.core - INFO - Starting established connection
[2019-05-17 08:18:53,111] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:53,169] [22/Thread-1] [INFO] [root] Scheduler.handle_task_finished called
[2019-05-17 08:18:53,169] [22/Thread-1] [INFO] [root] Scheduler._remove_from_processing called
[2019-05-17 08:18:53,169] [22/Thread-1] [INFO] [root] Scheduler.check_idle_saturated called
[2019-05-17 08:18:53,170] [22/Thread-1] [INFO] [root] Scheduler.release_resources called
[2019-05-17 08:18:53,170] [22/Thread-1] [INFO] [root] Scheduler._add_to_memory called
[2019-05-17 08:18:53,170] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:53,170] [22/Thread-1] [INFO] [root] Scheduler.transitions called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.worker_send called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.report called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.report_on_key called
[2019-05-17 08:18:53,171] [22/Thread-1] [INFO] [root] Scheduler.report called
# continuing indefinitely

Issue Analytics

  • State:closed
  • Created 4 years ago
  • Comments:15 (15 by maintainers)

github_iconTop GitHub Comments

1reaction
ogriselcommented, May 29, 2019

The dispatch functions can be called by the main thread in Parallel.__call__ but also by the async callback threads when tasks complete. Therefore we need a lock to avoid concurrent access to the parallel backend state by both threads when the first tasks complete before the main thread has finished dispatching.

1reaction
mrocklincommented, May 22, 2019
Read more comments on GitHub >

github_iconTop Results From Across the Web

Embarrassingly parallel Workloads - Dask Examples
This notebook shows how to use Dask to parallelize embarrassingly parallel workloads where you want to apply one function to many pieces of...
Read more >
Speeding up your Algorithms Part 4— Dask | by Puneet Grover
Run your Pandas/Numpy/Sklearn/Python code in parallel with Dask · Speed Up your Algorithms Part 1 — PyTorch · Speed Up your Algorithms Part...
Read more >
Dask For Loop In Parallel - Stack Overflow
You will want to use the multiprocessing or dask.distributed Dask backends rather than the default threading backend.
Read more >
Parallel computing with Dask - Xarray
Note that writing netCDF files with Dask's distributed scheduler is only supported for the netcdf4 backend. A dataset can also be converted to...
Read more >
Distributed Computing with dask - Practical Data Science
But your computer would probably crash. The biggest challenge in using a tool like dask or spark (they operate on the same principles...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found