
Adaptive.needs_cpu does not depend on number of tasks remaining

Issue description

We’re using distributed (with KubeCluster) and client.map to schedule a lot of long-running tasks (right now we’re running a Fortran-based hydrological model).

We noticed that clusters don’t scale down until all tasks have completed, even once the number of remaining tasks falls below the number of workers.

I isolated the problem to Adaptive.needs_cpu(). The current method does not check whether there are any pending tasks on the scheduler:

    def needs_cpu(self):
        """
        Check if the cluster is CPU constrained (too many tasks per core)
        Notes
        -----
        Returns ``True`` if the occupancy per core is some factor larger
        than ``startup_cost``.
        """
        total_occupancy = self.scheduler.total_occupancy
        total_cores = sum([ws.ncores for ws in self.scheduler.workers.values()])

        if total_occupancy / (total_cores + 1e-9) > self.startup_cost * 2:
            logger.info("CPU limit exceeded [%d occupancy / %d cores]",
                        total_occupancy, total_cores)
            return True
        else:
            return False

This results in Adaptive.recommendations() returning the error message “Trying to scale up and down simultaneously” whenever there are fewer pending tasks than there are workers but the average task time still suggests that more cores are needed (independent of the number of pending tasks).
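
To make the collision concrete, here is an illustrative-only paraphrase of how the two signals combine (the function below is a simplified stand-in, not the actual distributed internals):

    def combine_recommendations(should_scale_up, workers_to_close):
        """Paraphrase: if the occupancy-driven scale-up signal and the
        idle-worker scale-down signal both fire, error out instead of
        acting on either."""
        if should_scale_up and workers_to_close:
            return {'status': 'error',
                    'msg': 'Trying to scale up and down simultaneously'}
        if should_scale_up:
            return {'status': 'up'}
        if workers_to_close:
            return {'status': 'down', 'workers': workers_to_close}
        return {'status': 'same'}

    # A few long stragglers on many workers: occupancy still says "up"
    # while the idle workers say "down":
    print(combine_recommendations(True, ['tcp://worker-1', 'tcp://worker-2']))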

Proposed solution

I implemented a quick fix by counting the tasks currently assigned to workers and only recommending a scale-up if that count exceeds the number of existing workers, in addition to the current criterion:

    def needs_cpu(self):
        """
        Check if the cluster is CPU constrained (too many tasks per core)

        Notes
        -----
        Returns ``True`` if the occupancy per core is some factor larger
        than ``startup_cost`` *and* there are more pending tasks than
        workers to run them.
        """
        total_occupancy = self.scheduler.total_occupancy
        total_cores = sum(ws.ncores for ws in self.scheduler.workers.values())

        if total_occupancy / (total_cores + 1e-9) > self.startup_cost * 2:
            logger.info("CPU limit exceeded [%d occupancy / %d cores]",
                        total_occupancy, total_cores)

            # Count tasks currently assigned to workers; only recommend a
            # scale-up if they outnumber the existing workers.
            tasks_processing = sum(
                len(w.processing) for w in self.scheduler.workers.values()
            )
            num_workers = len(self.scheduler.workers)

            if tasks_processing > num_workers:
                logger.info("pending tasks exceed number of workers [%d tasks / %d workers]",
                            tasks_processing, num_workers)
                return True

        return False

Pros

  • Exhibits the desired behavior (we’re using this fix now by subclassing KubeCluster; see the sketch after this list)
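
A minimal sketch of that workaround, assuming a custom Adaptive subclass can be wired into the cluster (the exact hook for doing so depends on the distributed/dask-kubernetes version):

    from distributed.deploy.adaptive import Adaptive

    class PendingAwareAdaptive(Adaptive):
        """Adaptive that only scales up while tasks outnumber workers."""

        def needs_cpu(self):
            # Keep the existing occupancy-based check...
            if not super().needs_cpu():
                return False
            # ...and additionally require more processing tasks than workers.
            tasks_processing = sum(
                len(w.processing) for w in self.scheduler.workers.values()
            )
            return tasks_processing > len(self.scheduler.workers)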

Cons

  • May be a limited use case
  • Increases the overhead of needs_cpu. I tested this on a limited set of cases with between 800 and 100,000 tasks and found that the current implementation usually takes ~30-40 µs, while the proposed implementation roughly doubles this (see the micro-benchmark sketch after this list). There may be faster ways of doing this, but I imagine this may be a critical problem with this implementation, so help would be appreciated in estimating tasks remaining more quickly!
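
For reference, a rough, self-contained micro-benchmark sketch of the extra counting step (the fake scheduler state below is purely illustrative):

    import timeit
    from types import SimpleNamespace

    # Fake scheduler state: 100 workers with 1,000 processing tasks each.
    workers = {
        i: SimpleNamespace(processing=dict.fromkeys(range(1000)))
        for i in range(100)
    }
    scheduler = SimpleNamespace(workers=workers)

    def count_processing(s):
        return sum(len(w.processing) for w in s.workers.values())

    # Per-call cost of the extra sum, averaged over 10,000 calls.
    print(timeit.timeit(lambda: count_processing(scheduler), number=10_000) / 10_000)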

Testable example

Requires some interactivity, but reliably reproduces the problem:

In [1]: import dask.distributed as dd

In [2]: cluster = dd.LocalCluster()

In [3]: adaptive = cluster.adapt(minimum=0, maximum=10)

In [5]: adaptive
Out[5]: <distributed.deploy.adaptive.Adaptive at 0x1153b3668>

In [6]: def wait_a_while(i):
   ...:     import time
   ...:     import random
   ...:     s = (random.random()) ** 6 * 60
   ...:     time.sleep(s)
   ...:
   ...:     return s

In [8]: client = dd.Client(cluster)

In [9]: f = client.map(wait_a_while, range(10))

In [10]: # wait for most futures to finish

In [17]: f
Out[17]:
[<Future: status: finished, type: float, key: wait_a_while-fdc644303e9be2c85edd9201261409af>,
 <Future: status: finished, type: float, key: wait_a_while-97098da3920c7582be062b54ee78efe1>,
 <Future: status: finished, type: float, key: wait_a_while-630e0e1fb8a0f8ede1140368de97ffce>,
 <Future: status: pending, key: wait_a_while-09f09368b6e9555668ab3f82efad91dd>,
 <Future: status: finished, type: float, key: wait_a_while-65d1c81d072269ab477d806d017302e2>,
 <Future: status: finished, type: float, key: wait_a_while-ca96a3b8db585962fc8638066458a815>,
 <Future: status: finished, type: float, key: wait_a_while-0a13c1a4f503a08e1edaf79dba3c94c5>,
 <Future: status: finished, type: float, key: wait_a_while-549f788086c75f350390b4a6131ae6cb>,
 <Future: status: pending, key: wait_a_while-17133623fc213adcb83f3b45e53839c9>,
 <Future: status: pending, key: wait_a_while-41e284f91b0a2bb1c3a33394e51c97fc>]

In [18]: cluster._adaptive.recommendations()
Out[18]: {'status': 'error', 'msg': 'Trying to scale up and down simultaneously'}
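
For convenience, a non-interactive variant of the same reproducer (a sketch; the poll interval is arbitrary):

    import random
    import time

    import dask.distributed as dd

    def wait_a_while(i):
        s = random.random() ** 6 * 60
        time.sleep(s)
        return s

    if __name__ == '__main__':
        cluster = dd.LocalCluster()
        adaptive = cluster.adapt(minimum=0, maximum=10)
        client = dd.Client(cluster)
        futures = client.map(wait_a_while, range(10))
        # Once fewer pending tasks than workers remain (but the stragglers
        # are long-running), recommendations() starts returning the error above.
        while not all(f.done() for f in futures):
            print(adaptive.recommendations())
            time.sleep(5)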

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

1 reaction
mrocklin commented, Mar 19, 2020

I personally don’t know. If someone wants to look, though, I would recommend starting here:

https://github.com/dask/distributed/blob/2acffc3172ec32e173547ee4c39a01b6c94e74a1/distributed/scheduler.py#L5209-L5260

0 reactions
jrbourbeau commented, May 1, 2020

Indeed, thank you for following up here @guillaumeeb!
