LocalCluster CPU usage climbs to 100% over time without submitting computations or connecting workers
I noticed while working locally in a Jupyter Lab session on my laptop that computations would gradually slow down until there was nothing to do but restart the notebook kernel (and therefore the scheduler and workers). I started to pare down a minimal reproducing example and got all the way to launching a LocalCluster with zero workers. After five minutes, CPU use hovers around 100%, and a few minutes later I start getting the message

```
distributed.utils_perf - WARNING - full garbage collections took 19% CPU time recently
```

repeated every ~1 s. This is all without connecting a worker or running any computation.
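As a side note, one rough way to confirm the ~1 s cadence of that warning is to count records emitted on the `distributed.utils_perf` logger (the logger name is taken from the warning text above; this counting snippet is only a sketch and was not part of the original report):

```python
import logging
import time

# Sketch, not part of the original report: count how often the GC warning
# fires on the "distributed.utils_perf" logger to confirm the ~1 s cadence.
class _CountingHandler(logging.Handler):
    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.count = 0

    def emit(self, record):
        self.count += 1

handler = _CountingHandler()
logging.getLogger("distributed.utils_perf").addHandler(handler)

time.sleep(60)
print(f"GC warnings in the last minute: {handler.count}")
```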
What happened: Requesting a LocalCluster results in a process that grows to use 100% of the CPU even when no workers are launched and no computations are requested.
What you expected to happen: CPU use should remain roughly constant when the cluster is not under load.
Minimal Complete Verifiable Example:
```python
from dask.distributed import Client, LocalCluster

cluster = LocalCluster(n_workers=0)
client = Client(cluster.scheduler.address)
# wait 5 minutes and check CPU use
```
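To make the "check CPU use" step concrete, a minimal sketch of sampling the calling process's CPU might look like the following (with LocalCluster the scheduler runs in the calling process, so this approximates the scheduler's load). It assumes psutil is installed and was not part of the original report:

```python
import time

import psutil  # assumption: psutil is installed; not part of the original report

# Sample this process's CPU usage once per minute for five minutes.
proc = psutil.Process()
proc.cpu_percent(interval=None)  # first call primes the counter

for minute in range(1, 6):
    time.sleep(60)
    print(f"minute {minute}: CPU {proc.cpu_percent(interval=None):.1f}%")
```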
Anything else we need to know?:
I also tried setting the loop implementation to uvloop in ~/.config/dask/dask.yaml, and the issue persisted. The screenshots above are from running with the default config, except for:
```yaml
logging:
  distributed: debug
  distributed.client: warning
  distributed.scheduler: debug
  bokeh: error
```
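As an aside, the uvloop setting mentioned above can presumably also be applied from Python rather than ~/.config/dask/dask.yaml. The sketch below assumes the `distributed.admin.event-loop` config key and that uvloop is installed, and is not part of the original report:

```python
import dask
from dask.distributed import Client, LocalCluster

# Sketch, not part of the original report: select uvloop as the event-loop
# implementation before creating the cluster. Assumes the
# "distributed.admin.event-loop" config key and that uvloop is installed.
dask.config.set({"distributed.admin.event-loop": "uvloop"})

cluster = LocalCluster(n_workers=0)
client = Client(cluster.scheduler.address)
```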
Environment:
- Dask version: 2021.5.0
- Python version: 3.9.4
- Operating System: macOS Big Sur 11.3.1
- Install method (conda, pip, source): conda
Top GitHub Comments
Tried the new release and it worked for me, but forgot to comment. Thanks!
Thanks for reporting this issue @joseph-long! I tried this out locally and observed similar behavior.
cc @ian-r-rose due to the Jupyter connection