Await on handle_stream raises missing delete_data await warning
For increased visibility, I’m reposting https://github.com/dask/distributed/pull/3847/files#r443766556 as an issue here:
We have a few tests in dask-cuda that check the behavior of Device<->Host<->Disk spilling, and I noticed that one of them broke after the 2.19 release. I managed to track it down to one specific line of code in https://github.com/dask/distributed/blob/44b2358e33a0738c4c70ca96db4242636245e07d/distributed/core.py#L573, introduced by https://github.com/dask/distributed/pull/3847. The failing test is at https://github.com/rapidsai/dask-cuda/blob/branch-0.15/dask_cuda/tests/test_spill.py#L409-L411, where we assert that the zict dictionaries are empty after deleting `cdf2`, which is the object being spilled. The failure seems to happen because we’re not awaiting `Worker.delete_data` somewhere, as per the warning below, which doesn’t appear if I comment out `await gen.sleep(0)`:
dask_cuda/tests/test_spill.py::test_cudf_device_spill[params0]
/datasets/pentschev/miniconda3/envs/r-102-0.14/lib/python3.7/inspect.py:732: RuntimeWarning: coroutine 'Worker.delete_data' was never awaited
for modname, module in list(sys.modules.items()):
I think that the only place where `Worker.delete_data` would be called and should be awaited is in https://github.com/dask/distributed/blob/4f878b420b349ee725de5ef64fd5e664dedb8aba/distributed/scheduler.py#L2791-L2800, but that is only a guess at this time, because it’s really hard for me to follow all the async black magic. I’m going to keep trying to figure this out, but any suggestions on how to pinpoint it are appreciated!
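For context, a "coroutine ... was never awaited" warning generally means that a coroutine function was called like a regular function and its result was discarded, so its body never ran. The sketch below is not code from distributed; `delete_data`, `dispatch_without_await`, and `dispatch` are hypothetical stand-ins that only illustrate the general failure mode and one common way to guard against it.

```python
import asyncio
import inspect


async def delete_data(store, keys):
    """Hypothetical async handler: remove keys from an in-memory store."""
    for key in keys:
        store.pop(key, None)
    return "OK"


def dispatch_without_await(handler, **kwargs):
    # Bug: calling a coroutine function only creates a coroutine object.
    # The handler body never runs, and Python emits a RuntimeWarning
    # ("coroutine ... was never awaited") once that object is discarded.
    handler(**kwargs)


async def dispatch(handler, **kwargs):
    # Works for both plain functions and coroutine functions:
    # call the handler, then await the result only if it is awaitable.
    result = handler(**kwargs)
    if inspect.isawaitable(result):
        result = await result
    return result


async def main():
    store = {"x": 1, "y": 2}

    # The un-awaited call leaves the store untouched.
    dispatch_without_await(delete_data, store=store, keys=["x"])
    print("after un-awaited call:", store)

    # The awaited call actually deletes the key.
    status = await dispatch(delete_data, store=store, keys=["x"])
    print("after awaited call:", status, store)


if __name__ == "__main__":
    asyncio.run(main())
```

The silent part of this bug is what makes it hard to pinpoint: nothing raises, the data simply stays where it was, which matches a symptom like dictionaries that are unexpectedly non-empty after a delete.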
Issue Analytics
- Created 3 years ago
- Comments:6 (6 by maintainers)
Top GitHub Comments
I was able to write a test that reproduces the issue independently of GPUs and dask-cuda, so I opened #3922 with the fix suggested by @jakirkham and a test for it.
Thanks @jakirkham for looking at that. I verified that things work again after applying your suggestion:
Possibly the second part can be removed or has to be fixed, as the comment above it suggests. There’s no `remove_keys` anywhere in this repository. Happy to file a PR if this change is reasonable.