
KeyError in `Worker.handle_compute_task` (causes deadlock)

See the original GitHub issue.
Traceback (most recent call last):
  File "/opt/conda/envs/coiled/lib/python3.9/site-packages/distributed/worker.py", line 1237, in handle_scheduler
    await self.handle_stream(
  File "/opt/conda/envs/coiled/lib/python3.9/site-packages/distributed/core.py", line 564, in handle_stream
    handler(**merge(extra, msg))
  File "/opt/conda/envs/coiled/lib/python3.9/site-packages/distributed/worker.py", line 1937, in handle_compute_task
    self.tasks[key].nbytes = value
KeyError: "('slices-40d6794b777d639e9440f8f518224cfd', 2, 1)"
distributed.worker - INFO - Connection to scheduler broken.  Reconnecting...

I may be able to reproduce this if necessary. I was running a stackstac example notebook on binder against a Coiled cluster over wss, where the particular versions of things were causing a lot of errors (unrelated to dask). So I was frequently rerunning the same tasks, cancelling them, restarting the client, rerunning, etc. Perhaps this cancelling, restarting, rerunning is related?
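
For context, a minimal sketch of that cancel/restart/rerun pattern (the scheduler address and array sizes below are made up, and this is not a confirmed reproducer):

from dask.distributed import Client
import dask.array as da

client = Client("tcp://scheduler:8786")  # hypothetical scheduler address

# Build and submit a graph, then cancel, restart, and resubmit it.
x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
fut = client.compute(x.sum())

client.cancel(fut)             # cancel mid-flight
client.restart()               # restart workers, clearing their task state
fut = client.compute(x.sum())  # resubmit the same graph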

@fjetter says

The only reason this KeyError can appear is if the compute instruction transitions its own dependency into a forgotten state, which is very, very wrong.

Relevant code, ending at the line where the error occurs: https://github.com/dask/distributed/blob/11c41b525267cc144caa8d077b84a0939fecae97/distributed/worker.py#L1852-L1937

Scheduler code producing the message which causes this error: https://github.com/dask/distributed/blob/11c41b525267cc144caa8d077b84a0939fecae97/distributed/scheduler.py#L7953-L7994
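
The failing line, self.tasks[key].nbytes = value, is a plain lookup into the worker's task table, so the error means the scheduler's compute-task message named a dependency the worker no longer tracks. A stripped-down model of that failure mode (illustrative only; this TaskState is a stand-in, not the real distributed class):

class TaskState:  # stand-in for the worker-side task record
    def __init__(self, key):
        self.key = key
        self.nbytes = None

tasks = {"dep-key": TaskState("dep-key")}

# Suppose a cancel/restart race forgets the dependency...
del tasks["dep-key"]

# ...while a compute-task message from the scheduler still references it:
nbytes = {"dep-key": 1024}
for key, value in nbytes.items():
    tasks[key].nbytes = value  # KeyError: 'dep-key'

This matches the quoted diagnosis: the dependency was transitioned to forgotten before the worker processed the message that referenced it.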

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

bennnym commented, Nov 3, 2021 (1 reaction)

I was just installing dask[distributed], so I assume it was the latest version.

bennnym commented, Nov 3, 2021 (1 reaction)

@gjoseph92 my code was running in a notebook and my kernel died, sorry. That is all I have.
