Dask doesn't play well with interrupts
This is on Python 3.5.2, dask 0.10.1, and jupyter-notebook 4.2.1, all installed via conda on a 64-bit Ubuntu machine.
If you run the following piece of code:
import json
from dask.diagnostics import ProgressBar
import dask.bag as db

j = json.dumps({"a": 1, "b": 1})

for i in range(8):
    data = [j for _ in range(10 ** i)]
    bag = db.from_sequence(data, npartitions=4).map(json.loads)
    with ProgressBar():
        db.zip(bag.pluck("a"), bag.pluck('b')).count().compute()
and try a keyboard interrupt in the middle, dask will sometimes flip out and simply refuse to exit, no matter how many CTRL-C and CTRL-D presses you send it. See https://asciinema.org/a/2yqxbhn1patwenwy024m2316l for a recording.
This is especially annoying in a Jupyter notebook. Sometimes dask will mysteriously keep printing the progress bar from the last session, even after restarting the kernel (!).
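For the within-session case, one possible cause of stale progress output is a ProgressBar that is still registered globally (for example if an interrupted run never unregistered its callback). Below is a minimal sketch for clearing it; Callback.active is an internal dask detail and may change between versions, and the post-kernel-restart printing described above is more likely caused by the leftover processes mentioned in the edit below.

```python
# Sketch: clear any dask callbacks (including a stale ProgressBar) that are
# still registered globally in the current Python session.
# Callback.active is an internal dask detail and may change between versions.
from dask.callbacks import Callback

Callback.active.clear()
```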
EDIT: When this happens, dask will also leave stray python processes running. See https://asciinema.org/a/8x1liiyuiw7910mugj9tscher for an example (note the extra python process at the end).
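If you want to check for these stragglers programmatically rather than with ps, here is a minimal sketch using psutil (an assumption; psutil is not part of dask, and filtering on "dask" in the command line is a heuristic that may miss workers spawned via multiprocessing):

```python
# Sketch: list python processes whose command line mentions dask, so leftover
# workers can be inspected (and, if appropriate, terminated) after an interrupt.
import psutil

for proc in psutil.process_iter(["pid", "name", "cmdline"]):
    name = proc.info["name"] or ""
    cmdline = " ".join(proc.info["cmdline"] or [])
    if "python" in name and "dask" in cmdline:
        print(proc.info["pid"], cmdline)
```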
Top GitHub Comments
Possibly resolved in #1444
Something like this still occurs for me with the distributed scheduler running on a LocalCluster. When I work in a Jupyter notebook, the notebook is executing a blocking .compute() call, and I hit CTRL+C, the blocking call does return, but all the worker processes are killed (not very gracefully). The end result is zombie python processes and a scheduler that reports 0 workers. So far I haven't found a good way to recover from this without restarting the Jupyter kernel, because issuing Client.restart() doesn't restart the workers, and executing Client(LocalCluster()) again fails with "port 8787 is already in use". If anyone has a good tip on how to work around this, that would really help.
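One possible workaround to try (a sketch only, not verified against this exact interrupt scenario): close the existing client and cluster explicitly so their ports are released, then start a fresh LocalCluster on ephemeral ports. This assumes a dask.distributed version whose LocalCluster accepts scheduler_port and dashboard_address; client and cluster below refer to the objects already created in the session.

```python
# Sketch of a possible recovery path after an interrupted .compute().
from dask.distributed import Client, LocalCluster

client.close()       # release the old Client ('client' is the existing object in the session)
# cluster.close()    # likewise, if a reference to the old LocalCluster was kept

# Start a fresh cluster on ephemeral ports to sidestep "port ... already in use".
cluster = LocalCluster(scheduler_port=0, dashboard_address=":0")
client = Client(cluster)
```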