
Opportunistic Caching

See original GitHub issue

Currently we clean up intermediate results quickly if they are not necessary for any further pending computation. This is good because it minimizes the memory footprint on the workers, often allowing us to process larger-than-distributed-memory computations.

However, this can sometimes be inefficient for interactive workloads when users submit related computations one after the other, so that the scheduler has no opportunity to plan ahead, and instead needs to recompute an intermediate result that was previously computed and garbage collected.

We could hold on to some of these results in the hope that the user will request them again. This trades worker memory for potentially saved CPU time. Ideally we would hold onto results that (a rough scoring sketch follows the list):

  1. Have a small memory footprint
  2. Take a long time to compute
  3. Are likely to be requested again (evidenced by recent behavior)
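
As a rough illustration of how these three criteria could combine into one number, here is a minimal scoring sketch in the spirit of cachey's compute-time-per-byte heuristic. The function names, the half-life decay, and the threshold are illustrative assumptions, not cachey's actual API:

import time

def score(compute_time, nbytes, age, halflife=3600.0):
    # Favor results that were expensive to compute (criterion 2), small to
    # store (criterion 1), and produced or used recently (criterion 3).
    recency = 0.5 ** (age / halflife)        # exponential decay with age
    return (compute_time / max(nbytes, 1)) * recency

def should_keep(compute_time, nbytes, computed_at, threshold=1e-7):
    # Keep a result whose score clears a fixed threshold; a fuller
    # implementation would instead evict to fit a memory budget.
    return score(compute_time, nbytes, time.time() - computed_at) > threshold

For example, a fresh result that took 10 seconds to compute and occupies 1 MB scores 1e-5, well above the 1e-7 threshold, so it would be kept.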

We did this for the single-machine scheduler.

We could do it in the distributed scheduler fairly easily by creating a SchedulerPlugin that watches all computations, selects results to keep based on logic similar to what is currently in cachey, and creates a fake Client to hold an active reference to those keys on the scheduler (see the code sketch in the comments below).

Issue Analytics

  • State: open
  • Created: 7 years ago
  • Reactions: 3
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
TomAugspurger commented, Jun 13, 2019

Scheduler plugins are documented at https://distributed.dask.org/en/latest/plugins.html and the scheduler API at https://distributed.dask.org/en/latest/scheduling-state.html#distributed.scheduler.Scheduler

On Thu, Jun 13, 2019 at 12:11 PM, IPetrik wrote:

@mrocklin is the scheduler API explained somewhere? Can you provide more explanation of how this works? What do client_desires_keys and client_releases_keys do?


1 reaction
mrocklin commented, Jun 8, 2017

To be explicit, the mechanism to keep data on the cluster might look like this:

from distributed.diagnostics.plugin import SchedulerPlugin

class CachingPlugin(SchedulerPlugin):
    def __init__(self, scheduler):
        self.scheduler = scheduler
        self.scheduler.add_plugin(self)

    # should_keep and self.cleanup are placeholders; one possible sketch follows below
    def transition(self, key, start, finish, nbytes=None, startstops=None, *args, **kwargs):
        # When a task finishes computing, decide whether to pin its result
        if start == 'processing' and finish == 'memory' and should_keep(nbytes, startstops, **kwargs):
            self.scheduler.client_desires_keys(keys=[key], client='fake-caching-client')
        # Release anything the cache no longer wants to hold
        no_longer_desired_keys = self.cleanup()
        self.scheduler.client_releases_keys(keys=no_longer_desired_keys, client='fake-caching-client')

client.run_on_scheduler(lambda dask_scheduler: CachingPlugin(dask_scheduler))
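
The should_keep and self.cleanup pieces above are left undefined in the sketch. One hedged way to fill them in, assuming a fixed byte budget and the (action, start, stop) tuple layout that startstops carried at the time; both are assumptions, and these module-level functions would still need to be wired in as plugin methods (with the key passed through to should_keep):

CACHE_BUDGET = 2e9     # assumption: pin at most ~2 GB of opportunistic results
cached = {}            # key -> (score, nbytes) for keys currently pinned

def should_keep(nbytes, startstops, key=None, **kwargs):
    # Total time spent in 'compute' actions, as recorded by the scheduler
    compute_time = sum(stop - start
                       for action, start, stop in (startstops or ())
                       if action == 'compute')
    s = compute_time / max(nbytes or 0, 1)   # compute time per byte
    if s > 1e-7:
        cached[key] = (s, nbytes or 0)
        return True
    return False

def cleanup():
    # Evict the lowest-scoring keys until the pinned bytes fit the budget
    evicted = []
    while cached and sum(n for _, n in cached.values()) > CACHE_BUDGET:
        worst = min(cached, key=lambda k: cached[k][0])
        del cached[worst]
        evicted.append(worst)
    return evicted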

