
Distributed LocalCluster's `memory_limit` keyword argument needs documentation


Minimal Complete Verifiable Example:

from dask.distributed import Client, LocalCluster

# Start a local cluster: 2 worker processes, 4 threads per worker,
# a 0.95 memory target fraction, and a 32 GB memory limit.
cluster = LocalCluster(n_workers=2,
                       threads_per_worker=4,
                       memory_target_fraction=0.95,
                       memory_limit='32GB')
client = Client(cluster)
client  # in a notebook, this displays the cluster summary (workers, threads, memory)

What happened: It looks like the memory_limit keyword argument used in this example sets the limit for the entire cluster (see screenshot below). If that's the case, it would be helpful to add it to the LocalCluster documentation.

Edit: It sets the limit per worker. My example is a special case because my machine has only 16 GB of RAM, so the effective limit cannot go beyond that (see @jcrist's comments below for more details). It would still be useful to document this behavior.
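
Since the limit applies per worker, the cluster-wide budget is roughly n_workers * memory_limit, capped by the RAM the machine actually has. A rough sketch of how to confirm the per-worker limits (not from the issue; it assumes client.scheduler_info() exposes each worker's memory_limit field, and it uses a hypothetical 4GB value so the cap discussed below does not apply on most machines):

from dask.distributed import Client, LocalCluster
from dask.utils import format_bytes

# Two worker processes, each with its own 4 GB limit (hypothetical value).
cluster = LocalCluster(n_workers=2, threads_per_worker=4, memory_limit="4GB")
client = Client(cluster)

# scheduler_info() lists one entry per worker; each entry carries that
# worker's own memory_limit, so the cluster-wide budget is their sum.
for addr, info in client.scheduler_info()["workers"].items():
    print(addr, format_bytes(info["memory_limit"]))

client.close()
cluster.close()

With the 32GB request on the 16 GB machine above, each worker instead reports roughly 16 GiB, because of the clamping described in the comments below.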

Anything else we need to know?:

Possible causes of confusion:

The StackOverflow question that surfaced this issue is here.

Screenshot:

(Dask cluster summary screenshot, 2021-10-06)

Environment:

  • Dask version: 2021.09.1
  • Python version: 3.9.7
  • Operating System: macOS
  • Install method (conda, pip, source): conda

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 13 (9 by maintainers)

Top GitHub Comments

2 reactions
jcrist commented, Oct 6, 2021

I just noticed that too. This is because we take the min of the user input and the total available system memory.

https://github.com/dask/distributed/blob/defe454f63199799b403a3ddeee04b473adf0dfd/distributed/worker.py#L3805

So if your machine has 16 GiB of RAM, each worker is limited to a max of 16 GiB of RAM even if memory_limit="32 GiB". This is what happened in @pavithraes' case above, and I think this is the correct behavior (but it could also be called out in the docstring).
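
In other words, the effective per-worker limit is min(requested limit, total system RAM). A minimal sketch of that clamping (not the internal helper linked above; it uses dask.utils.parse_bytes plus psutil, which distributed already depends on):

import psutil
from dask.utils import format_bytes, parse_bytes

requested = parse_bytes("32GB")               # per-worker limit the user asked for
system_total = psutil.virtual_memory().total  # total RAM detected on this machine

# The worker clamps its limit to the memory the machine actually has, so on
# a 16 GiB laptop a request for 32 GB still yields roughly 16 GiB per worker.
effective = min(requested, system_total)
print(format_bytes(effective))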

0 reactions
pavithraes commented, Aug 29, 2022

Closing as completed. Thanks, @crislanarafael!

