Nanny error: Worker process was killed by unknown signal
distributed.nanny - WARNING - Worker process 13375 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Worker process 13377 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Worker process 13372 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Worker process 13383 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Worker process 13373 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Worker process 13384 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker
distributed.nanny - WARNING - Worker process 13380 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker
This happens without fail when using read_parquet with fastparquet. It can be avoided with pyarrow, but still happens x% of the time (x depends on how you set up n_workers, n_clients, and memory_limit in the client, but I would say it is always greater than 25%).
My machine runs Fedora 27, and I was able to work around the problem by setting multiprocessing-method to spawn, thanks to help from @mrocklin.
(In debugging this with @mrocklin, we were never able to get more information about what the root cause was.)
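For anyone wanting to try the same workaround, here is a minimal sketch of one way to set it from Python. The config key `distributed.worker.multiprocessing-method` and the `DASK_*` environment variable naming are assumptions based on how Dask configuration is usually laid out, so check them against your installed version. The helper names `env_var_for` and `use_spawn` are made up for this example.

```python
import os

# Assumed config key; verify it against your installed distributed version.
CONFIG_KEY = "distributed.worker.multiprocessing-method"


def env_var_for(key: str) -> str:
    """Translate a dotted Dask config key into a DASK_* environment variable name."""
    return "DASK_" + key.upper().replace(".", "__").replace("-", "_")


def use_spawn() -> None:
    """Ask nannies to start worker processes with 'spawn'."""
    # Export the setting so processes launched later (e.g. dask-worker) can see it.
    os.environ[env_var_for(CONFIG_KEY)] = "spawn"
    try:
        import dask  # imported lazily so the sketch also runs without dask installed
    except ImportError:
        return
    # Apply the setting to the current process as well.
    dask.config.set({CONFIG_KEY: "spawn"})


use_spawn()
```

Call `use_spawn()` before creating the Client or cluster, since the setting is only read when the nanny starts a worker process. This does not explain the root cause; it only switches how worker processes are started.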
Issue Analytics
- State:
- Created 6 years ago
- Comments:15 (8 by maintainers)
Top GitHub Comments
Another (self contained, though less minimal) example:
So far so good. Then df2.mean().compute() results in traceback:
Similar code snippets which execute as expected:
- df['a_1']=(df['a_1']*10000).astype(int)
- np.random.rand(3_000_000,20) to np.random.rand(2_000_000,20)
This issue would benefit from a minimum reproducible example.