Extend memory spilling to multiple storage media

As part of the ongoing work in #35, we will be able to spill CUDA device memory to host memory, and host memory to disk. However, as @kkraus14 pointed out, it would be beneficial to allow spilling host memory to multiple user-defined storage media.

I think we could follow the same configuration structure as Alluxio, as suggested by @kkraus14. Based on the current structure proposed in #35 (still subject to change), it would look something like the following:

cuda.worker.dirs.path=/mnt/nvme,/mnt/ssd,/mnt/nfs
cuda.worker.dirs.quota=16GB,100GB,1000GB
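To make the proposed layout concrete, here is a minimal Python sketch of how the two properties above might be parsed into an ordered list of spill tiers. This is not the implementation from #35: the names `SpillTier`, `parse_size`, `parse_tiers`, and `choose_tier` are hypothetical. It also assumes decimal units (1GB = 10**9 bytes) and that directories are listed fastest first, neither of which is stated in the issue.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumption: decimal units. The issue does not say whether GB means 10**9 or 2**30.
_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text: str) -> int:
    """Convert a string such as '16GB' to a number of bytes."""
    text = text.strip().upper()
    for suffix, factor in _UNITS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(text)  # no suffix: treat the value as plain bytes


@dataclass
class SpillTier:
    """Hypothetical record for one spill directory and its quota."""
    path: str
    quota: int      # maximum bytes allowed in this directory
    used: int = 0   # bytes currently accounted to this directory

    def free(self) -> int:
        return self.quota - self.used


def parse_tiers(config: dict[str, str]) -> list[SpillTier]:
    """Pair each directory with the quota at the same position."""
    paths = config["cuda.worker.dirs.path"].split(",")
    quotas = config["cuda.worker.dirs.quota"].split(",")
    if len(paths) != len(quotas):
        raise ValueError("each spill directory needs exactly one quota")
    return [SpillTier(p.strip(), parse_size(q)) for p, q in zip(paths, quotas)]


def choose_tier(tiers: list[SpillTier], nbytes: int) -> SpillTier | None:
    """Reserve space in the first tier (in configured order) that can hold nbytes."""
    for tier in tiers:
        if tier.free() >= nbytes:
            tier.used += nbytes
            return tier
    return None  # no tier has enough free quota


config = {
    "cuda.worker.dirs.path": "/mnt/nvme,/mnt/ssd,/mnt/nfs",
    "cuda.worker.dirs.quota": "16GB,100GB,1000GB",
}
tiers = parse_tiers(config)
target = choose_tier(tiers, 20 * 10**9)  # 20 GB does not fit in the 16 GB NVMe quota
```

Under these assumptions, a 20 GB spill skips `/mnt/nvme` and is placed in `/mnt/ssd`. A real implementation would also need to release quota when data is read back and decide what to do when every tier is full.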

@mrocklin FYI

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

3 reactions
jakirkham commented, Nov 28, 2019

One related note for tracking: it would be useful to leverage GPUDirect Storage to allow spilling directly from GPU memory to disk.

2 reactions
pentschev commented, May 27, 2021

No. Managed memory is handled by the CUDA driver; we have no control over how it handles spilling, and it does not support spilling to disk at all. Within Dask, you can enable spilling as I mentioned above. It does not use managed memory and is therefore not as performant, but it allows Dask to spill Python memory (i.e., Dask array/dataframe chunks). However, it has no control over memory handled internally by libraries such as cuDF.
