
LocalCluster not respecting dask distributed config

See original GitHub issue

Running dask.config.set to update the worker memory targets does not seem to affect the worker limits in LocalCluster; the config is not respected, even though the updated values do show up in the print.
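(Side note on the print: dask.config.set expands dotted keys into the nested config dictionary of the current process, which is why the updated values are visible immediately even when workers don't honor them. A minimal sketch, assuming a recent dask:)

```python
import dask

# Dotted keys expand into the nested config mapping in this process.
dask.config.set({"distributed.worker.memory.target": False,
                 "distributed.worker.memory.spill": False})

# The change is visible right away via dask.config.get...
assert dask.config.get("distributed.worker.memory.target") is False
# ...but workers only read these values when they start, so the timing of
# this call relative to cluster creation is what matters.
```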

import dask
from dask.datasets import timeseries
from distributed import LocalCluster, Client

# Disable the memory-management thresholds (the print shows the new values)
dask.config.set({'distributed.worker.memory.target': False, 'distributed.worker.memory.spill': False})
print(dask.config.config['distributed']['worker']['memory'])

cluster = LocalCluster(n_workers=4, threads_per_worker=2, memory_limit='1G')
client = Client(cluster)

# Persist far more data than the 4 GB of total worker memory
dfs = []
for i in range(60):
    dfs.append(client.persist(timeseries()))

I would have expected the code above to kill the workers with memory errors, but nothing happens. And subsequent calls, e.g.

for x in dfs:
    print(x.size.compute())

now take a long time (the data didn't fit into memory and is being loaded back from disk).

How can I, in Python, spin up a LocalCluster with no-spill settings?
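One approach, sketched here as an assumption rather than the maintainers' exact answer from this thread: in recent distributed releases the worker memory thresholds are read from config at worker startup, so setting them before the cluster is created is respected. The target/spill/pause keys are the current config names; verify them against the version you run.

```python
import dask
from dask.distributed import Client, LocalCluster

# Disable the memory-management thresholds *before* the workers start,
# so they pick the values up at startup. The terminate threshold is left
# at its default, so workers are still killed past the hard memory_limit.
dask.config.set({
    "distributed.worker.memory.target": False,  # don't proactively free data
    "distributed.worker.memory.spill": False,   # don't spill to disk
    "distributed.worker.memory.pause": False,   # don't pause execution
})

cluster = LocalCluster(n_workers=2, threads_per_worker=2, memory_limit="1GiB")
client = Client(cluster)
```

A dask.config.set context manager wrapped around the LocalCluster call works equally well and keeps the override local.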

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 12 (7 by maintainers)

Top GitHub Comments

1 reaction
bluecoconut commented, Apr 8, 2019

Awesome! thank you~

0 reactions
mrocklin commented, Apr 27, 2019

@zyxue I recommend raising a new issue with your situation

Read more comments on GitHub >

Top Results From Across the Web

Cannot configure local cluster dask worker directory
I use distributed version 2021.02.0 and i can set the worker directory this way cluster = LocalCluster(name="New Cluster", ...

API — Dask.distributed 2022.12.1 documentation
Registers a setup callback function for all current and future workers. ... If you do not pass a scheduler address, Client will create...

Just Start with the Dask LocalCluster | Saturn Cloud Blog
You can get lots of value from Dask without even using a distributed cluster. Try using the LocalCluster instead!

API — dask-cuda 22.12.00a0+g2c99f5a documentation
This size is a per-worker configuration, and not cluster-wide. ... from dask_cuda import LocalCUDACluster >>> from dask.distributed import Client ...

Access dashboard for LocalCluster on pangeo deployments
Julius, you should be able to test this yourself by setting that config value. In [13]: import dask, distributed In [14]: dask.config.set(**{"distributed.
