Extend memory spilling to multiple storage media

With the work in progress in #35, we will be able to spill CUDA device memory to host, and host memory to disk. However, as pointed out by @kkraus14, it would be beneficial to allow spilling host memory to multiple user-defined storage media.

I think we could follow the same configuration structure as Alluxio, as suggested by @kkraus14. Based on the current structure proposed in #35 (still subject to change), it would look something like the following:

cuda.worker.dirs.path=/mnt/nvme,/mnt/ssd,/mnt/nfs
cuda.worker.dirs.quota=16GB,100GB,1000GB
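To make the proposal concrete, here is a minimal sketch of how such a configuration might be interpreted: the two comma-separated lists are zipped into ordered tiers, and a spill goes to the first tier with enough remaining quota. The names `Tier`, `parse_size`, `parse_tiers`, and `pick_tier` are hypothetical and not part of dask-cuda or #35. Decimal units and the first-fit policy are also assumptions, since the issue does not specify either.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: "GB" means 10**9 bytes; the issue does not define the units.
_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


@dataclass
class Tier:
    """One storage medium (hypothetical helper, not a dask-cuda class)."""
    path: str
    quota: int      # maximum bytes that may be spilled to this path
    used: int = 0   # bytes currently spilled to this path

    @property
    def free(self) -> int:
        return self.quota - self.used


def parse_size(text: str) -> int:
    """Convert a string such as '16GB' into a byte count."""
    text = text.strip().upper()
    for suffix, factor in _UNITS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(text)  # no suffix: treat the value as plain bytes


def parse_tiers(config: Dict[str, str]) -> List[Tier]:
    """Pair each path with its quota, keeping the order given in the config."""
    paths = config["cuda.worker.dirs.path"].split(",")
    quotas = config["cuda.worker.dirs.quota"].split(",")
    if len(paths) != len(quotas):
        raise ValueError("each path needs exactly one quota")
    return [Tier(p.strip(), parse_size(q)) for p, q in zip(paths, quotas)]


def pick_tier(tiers: List[Tier], nbytes: int) -> Optional[Tier]:
    """First-fit policy (an assumption): use the earliest tier with room."""
    for tier in tiers:
        if tier.free >= nbytes:
            return tier
    return None  # nothing can hold the object


config = {
    "cuda.worker.dirs.path": "/mnt/nvme,/mnt/ssd,/mnt/nfs",
    "cuda.worker.dirs.quota": "16GB,100GB,1000GB",
}
tiers = parse_tiers(config)
```

With this ordering, the fastest medium is listed first and fills up before slower ones are used, which mirrors how tiered storage is usually laid out. A real implementation would also need to update `used` on every spill and unspill.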

@mrocklin FYI

Issue Analytics

Created 4 years ago
Comments: 11 (4 by maintainers)

Top GitHub Comments

3 reactions

jakirkham commented, Nov 28, 2019

One related note for tracking: it would be useful to leverage GPUDirect Storage to allow spilling directly from GPU memory to disk.

2 reactions

pentschev commented, May 27, 2021

No. Managed memory is handled by the CUDA driver; we have no control over how it handles spilling, and it doesn't support any spilling to disk whatsoever. Within Dask, you can enable spilling as I mentioned above. It doesn't make use of managed memory and thus is not as performant, but it will allow Dask to spill Python memory (i.e., Dask array/dataframe chunks). However, it has no control over memory that is handled internally by libraries such as cuDF.