GPU-friendly loads / merge_frames

Just throwing this up here for now, need to investigate more.

I’m working on a distributed cudf join using UCX. Things progress fine until, in the Client process, we attempt to deserialize some data (I think the final result?). We end up calling loads with deserialize=True: https://github.com/dask/distributed/blob/fb30c33562862f30864456766424b44a3e91aa5b/distributed/protocol/core.py#L101

which calls merge_frames: https://github.com/dask/distributed/blob/fb30c33562862f30864456766424b44a3e91aa5b/distributed/protocol/utils.py#L80

which attempts to convert the data to a byte string.

At this point in the client process, frames is a list of objects representing device memory. If possible (and I think it’s possible), I’d like to avoid copying to the host here.
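
To make the concern concrete, here is a rough sketch of the kind of regrouping a merge_frames-style function does. This is not the code in distributed: merge_frames_sketch, its (lengths, frames) signature, and the nbytes helper are hypothetical, and it only handles host buffers. The point is the join at the end, which always produces a host byte string.

```python
def nbytes(frame):
    """Size in bytes of any object that supports the buffer protocol."""
    return memoryview(frame).nbytes


def merge_frames_sketch(lengths, frames):
    """Regroup `frames` so that output frame i holds exactly lengths[i] bytes.

    Hypothetical illustration only, not the implementation in distributed.
    """
    lengths = list(lengths)
    frames = list(frames)
    if sum(lengths) != sum(map(nbytes, frames)):
        raise ValueError("total frame size does not match the expected lengths")

    # Fast path: the frames are already split as expected, so nothing is copied.
    if len(frames) == len(lengths) and all(
        nbytes(f) == n for f, n in zip(frames, lengths)
    ):
        return frames

    # Slow path: treat the frames as a stack of byte views and carve them up.
    pending = [memoryview(f).cast("B") for f in reversed(frames)]
    out = []
    for need in lengths:
        parts = []
        while need:
            frame = pending.pop()
            if frame.nbytes <= need:
                parts.append(frame)
                need -= frame.nbytes
            else:
                parts.append(frame[:need])    # take what this output needs
                pending.append(frame[need:])  # push the remainder back
                need = 0
        # This join builds a new host bytes object. With device frames,
        # this is the step that would force a device-to-host copy.
        out.append(b"".join(parts))
    return out
```

In this sketch the copy only happens on the slow path. If the frames arrive with the expected lengths they are returned untouched, which is the behaviour one would want for device memory.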

Actually, this may only be possible if the Client happens to have a GPU as well. In this case that’s true, but not in general.
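
A minimal sketch of that condition, assuming device frames advertise themselves through the `__cuda_array_interface__` attribute (the Numba CUDA Array Interface). The names is_device_frame, can_skip_host_copy, client_has_gpu, and FakeDeviceBuffer are made up for illustration and are not part of distributed.

```python
def is_device_frame(frame):
    """True if `frame` advertises GPU memory via __cuda_array_interface__."""
    return hasattr(frame, "__cuda_array_interface__")


def can_skip_host_copy(frames, client_has_gpu):
    """Hypothetical check: keep frames on the device only when the receiving
    process has a GPU and every frame is device memory."""
    return client_has_gpu and all(is_device_frame(f) for f in frames)


class FakeDeviceBuffer:
    """Stand-in for a device buffer so the sketch runs without a GPU.

    The pointer below is a dummy value and must never be dereferenced.
    """

    def __init__(self, size):
        self.__cuda_array_interface__ = {
            "shape": (size,),
            "typestr": "|u1",
            "data": (0, False),
            "version": 2,
        }


device_frames = [FakeDeviceBuffer(16), FakeDeviceBuffer(32)]
mixed_frames = [FakeDeviceBuffer(16), b"host bytes"]

print(can_skip_host_copy(device_frames, client_has_gpu=True))   # client has a GPU
print(can_skip_host_copy(device_frames, client_has_gpu=False))  # CPU-only client
print(can_skip_host_copy(mixed_frames, client_has_gpu=True))    # mixed frames
```

A CPU-only client would still need the host copy, so any device-preserving path has to be a fallback-friendly option rather than the default.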

TODO:

figure out exactly where the client is calling this
…

Issue Analytics

State:
Created 5 years ago
Comments: 14 (14 by maintainers)

Top GitHub Comments

TomAugspurger commented, Apr 28, 2020

I think we’ll close this and reopen if we come across it with the current implementations.

jakirkham commented, Apr 28, 2020

To the spirit of the issue, we do have cuda_dumps and cuda_loads. Thus far merge_frames doesn’t behave how we would expect ( https://github.com/dask/distributed/issues/3580 ) so we mostly avoid it. Though PR ( https://github.com/dask/distributed/pull/3732 ) has merge and split frame style functions. So maybe that solves that piece of this issue?