
[Question] A new approach to memory spilling


Question: would the Dask/Distributed community be interested in an improved memory-spilling model that fixes the shortcomings of the current one but makes use of proxy object wrappers?

In Dask-CUDA we have introduced a new approach to memory spilling that handles object aliasing and JIT memory un-spilling: https://github.com/rapidsai/dask-cuda/pull/451

The result is memory spilling that tracks object aliasing correctly and un-spills data just in time, when it is actually accessed.

The current implementation in Dask-CUDA handles CUDA device objects, but it is possible to generalize it to also handle spilling to disk.

The disadvantage of this approach is that the proxy objects get exposed to the users. The inputs to a task might be wrapped in a proxy object, which doesn’t mimic the proxied object perfectly. E.g.:

    # Type checking using isinstance() works as expected, but direct type checking doesn't:
    >>> import numpy as np
    >>> from dask_cuda.proxy_object import asproxy
    >>> x = np.arange(3)
    >>> isinstance(asproxy(x), type(x))
    True
    >>> type(asproxy(x)) is type(x)
    False
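The behavior shown above can be reproduced with very little Python. The following is a hypothetical toy sketch, not dask_cuda's actual implementation (names such as `ProxyObject`, `_unspill`, and `_cached` are invented here): it "spills" the wrapped object by pickling it, un-spills it lazily on first attribute access, and overrides `__class__` so that `isinstance()` checks pass while `type()` still reports the proxy class.

```python
import pickle

class ProxyObject:
    """Toy proxy: holds a serialized ("spilled") copy of an object and
    deserializes it lazily on first use (JIT un-spilling)."""

    def __init__(self, obj):
        self._spilled = pickle.dumps(obj)   # the "spilled" representation
        self._real_cls = type(obj)          # remembered so isinstance() works
        self._cached = None                 # un-spilled object, once accessed

    def _unspill(self):
        if self._cached is None:
            self._cached = pickle.loads(self._spilled)
        return self._cached

    @property
    def __class__(self):
        # isinstance(proxy, RealClass) consults __class__, so it succeeds;
        # type(proxy) bypasses this and still reports ProxyObject.
        return self._real_cls

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself:
        # un-spill and delegate to the wrapped object.
        return getattr(self._unspill(), name)
```

Note that special methods (e.g. `__iter__`, `__add__`) are looked up on the type and bypass `__getattr__`, so a real proxy must forward each of them explicitly; this is part of why a proxy can never mimic the wrapped object perfectly.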

Because of this, the approach shouldn’t be enabled by default. Still, do you think the Dask community would be interested in a generalization of this approach, or is the proxy-object hurdle too much of an issue?

cc. @mrocklin, @jrbourbeau, @quasiben

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 20 (19 by maintainers)

Top GitHub Comments

1 reaction
mrocklin commented, Mar 23, 2021

Right, breaking the MutableMapping abstraction would make it fairly easy to fix the double-counting issue (#4186) and avoid the memory spikes caused by an incorrect memory tally.

This may also provide other benefits, such as async reading/writing, worker scheduling that favors tasks whose data is already in fast memory, and pre-fetching of data from slow memory.
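For context, today's spilling sits behind a MutableMapping interface (Distributed uses zict for this). The comment above is about breaking that abstraction. The toy sketch below, with invented names and a pickle-based stand-in for disk, only illustrates the dict-like pattern being discussed; it is not Distributed's actual implementation:

```python
from collections.abc import MutableMapping
import pickle
import sys

class SpillBuffer(MutableMapping):
    """Toy two-tier store: live objects in `fast`, pickled bytes in `slow`.
    Once the (approximate) byte budget is exceeded, values overflow from
    fast to slow; reading a spilled key moves it back ("un-spills" it)."""

    def __init__(self, budget):
        self.budget = budget   # approximate byte budget for fast memory
        self.fast = {}         # key -> live object
        self.slow = {}         # key -> pickled bytes (stand-in for disk)
        self.used = 0          # rough tally of bytes held in fast

    def __setitem__(self, key, value):
        if key in self.fast or key in self.slow:
            del self[key]
        self.fast[key] = value
        self.used += sys.getsizeof(value)
        self._maybe_spill()

    def _maybe_spill(self):
        # Evict the insertion-order-oldest fast entries until under budget.
        while self.used > self.budget and self.fast:
            key, value = next(iter(self.fast.items()))
            del self.fast[key]
            self.used -= sys.getsizeof(value)
            self.slow[key] = pickle.dumps(value)

    def __getitem__(self, key):
        if key in self.fast:
            return self.fast[key]
        # Un-spill on access: the whole value moves back into fast memory
        # (and may be immediately re-spilled if it blows the budget).
        value = pickle.loads(self.slow.pop(key))  # KeyError if truly absent
        self[key] = value
        return value

    def __delitem__(self, key):
        if key in self.fast:
            self.used -= sys.getsizeof(self.fast.pop(key))
        else:
            del self.slow[key]

    def __iter__(self):
        yield from self.fast
        yield from self.slow

    def __len__(self):
        return len(self.fast) + len(self.slow)
```

Because such a mapping only sees whole-value get/set operations, it cannot observe aliasing between values or defer un-spilling until the data is actually used, which is exactly the limitation the proxy-object approach targets.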

0 reactions
mrocklin commented, Mar 25, 2021

If this is very effective for shuffle workloads then maybe it’s something that we could implement just for that code path? That might be a tightly scoped place to try this out more broadly.
