dask not respecting worker resources

See original GitHub issue

I believe this issue is related to this SO post.

I’m finding that, despite resource restrictions, the dask-scheduler will assign keys to worker nodes that don’t have the required resources.

For example:

from dask.distributed import LocalCluster, Client
cluster = LocalCluster(n_workers=0)
cluster.start_worker(ncores=1)  # note: resources={'CPU': 1} deliberately omitted, so this worker has no resources to compute with

client = Client(cluster.scheduler.address)
fut = client.submit(lambda x=1: x+1, resources={'CPU':1})
client.who_has()  # here we expect to see nothing as no workers have resources

fut2 = client.submit(lambda x=1: x+1)
client.who_has() # yep, assigned because it doesn't need resources

import dask.dataframe as dd
import pandas.util.testing as tm
import os 
files = dd.from_pandas(tm.makeTimeSeries(1000, freq='10ms'), npartitions=4).to_csv('example*.csv')

# simulate some pipeline where data is read and transformed 
fut3 = client.compute(dd.read_csv('example*.csv').to_delayed(), resources={'CPU':1})
client.who_has()  
# now we see that the scheduler has placed the delayed keys 
# onto the worker even though it has no resources to compute.
#  "('from-delayed-238ec9c6404d8e52399becf66971834c', 2)": ('tcp://127.0.0.1:44429',),
#  "('pandas_read_text-read-block-from-delayed-238ec9c6404d8e52399becf66971834c', 2)": (),
[os.remove(f) for f in files]

Likewise, I’m finding similar behavior with methods like read_parquet.
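
A sketch of the analogous parquet case (hypothetical path and variable name, same pattern as the CSV example above):

# hypothetical parquet analogue of the CSV example; 'example.parquet' is a placeholder path
fut4 = client.compute(dd.read_parquet('example.parquet').to_delayed(), resources={'CPU': 1})
client.who_has()  # the upstream read keys reportedly end up on the resource-less worker here too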

dask.__version__
'1.2.2'

Issue Analytics

  • State: open
  • Created 4 years ago
  • Reactions: 2
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

1 reaction
nuKs commented, Apr 21, 2021

I did check with htop and resources are respected in my case: even though the worker’s “Processing” tab displays multiple “ongoing” tasks, only the top task in the worker’s task list is actually being processed.
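
A quick way to double-check this from the client side (a sketch; only tasks that are actually executing show a call stack):

client.processing()  # keys the scheduler has assigned to each worker
client.call_stack()  # call stacks of the tasks actually running right now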

I have Consumed > Total as well. I think that is normal, since the consumed count covers all tasks that have moved through the worker:

  • Some tasks may have moved to another worker, so “Consumed” > “Finished tasks”.
  • There can be more “Processing” tasks than “Consumed” ones.
  • Total == “most at a single time”. That is still a bit confusing to me, though; this is just my interpretation.
0 reactions
bw4sz commented, Aug 5, 2022

My experience here is the same as above: Consumed > Total, but if you look at the call stacks, only the top task is actually being computed, which is good news.

[screenshots attached in the original comment]

My conclusion is that the resources behavior is working properly, but the ‘consumed’ label is just a bit confusing. My intuition was that dask would label the other tasks as ‘waiting’ rather than ‘processing’, since all resources are occupied. Hope this helps others.
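
For anyone hitting this on newer dask/distributed releases: the Worker Resources documentation describes attaching resources to specific layers of a collection with dask.annotate. A minimal sketch, assuming a recent release and a worker started with resources={'CPU': 1}:

import dask
import dask.dataframe as dd
from dask.distributed import Client

client = Client()  # hypothetical cluster; at least one worker must advertise the 'CPU' resource

with dask.annotate(resources={'CPU': 1}):  # annotate the read layer itself
    df = dd.read_csv('example*.csv')       # these tasks should only run where CPU=1 is available

result = df.compute(optimize_graph=False)  # the documented example disables optimization so the annotations survive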

Read more comments on GitHub >
