Client.gather not awaiting futures created with Client.run

Hi, I’m struggling with running async functions concurrently using dask distributed. I’ve attempted to use client.run to launch some tasks on dedicated workers, in conjunction with client.gather to retrieve the results. As far as I can tell from reading the docs, my approach should be correct, hence I am raising it as an issue here; however, I may be missing something, in which case the docs could potentially be improved.

For context, I’m building an application in which user-defined classes represent nodes within a process graph (think manufacturing plant, etc.). The nodes execute bespoke code and communicate data via channels (e.g. dask.distributed.Queue). Nodes in the graph may have a large memory footprint (e.g. they contain trained machine learning models). Each node should execute all of its iterations on a single worker until it receives a termination signal. To satisfy this requirement I am using client.run and specifying a single worker, assigning workers to nodes in a round-robin fashion. I realise this pattern may not be ideal and is perhaps a bit of a hack; I’m currently exploring how to implement this.
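The round-robin assignment described above can be sketched with the standard library alone. This is a minimal illustration, not the application's actual code: `assign_round_robin`, the node names, and the worker addresses are all hypothetical.

```python
from itertools import cycle
from typing import Dict, List


def assign_round_robin(nodes: List[str], workers: List[str]) -> Dict[str, str]:
    """Map each node name to a worker address, cycling through the workers."""
    if not workers:
        raise ValueError("at least one worker is required")
    worker_cycle = cycle(workers)
    return {node: next(worker_cycle) for node in nodes}


# Hypothetical node names and worker addresses, for illustration only.
nodes = ["loader", "model_a", "model_b", "writer"]
workers = ["tcp://127.0.0.1:40001", "tcp://127.0.0.1:40002"]
assignment = assign_round_robin(nodes, workers)
for node, worker in assignment.items():
    print(f"{node} -> {worker}")
```

Each node keeps the worker it was given, so repeated iterations of the same node can be directed to the same address.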

I have created a minimal example which follows the same pattern as my actual application code and reproduces the same issue.

What happened: I create a list of futures by calling client.run in a loop, passing different arguments to a function targeted to execute on specific workers. I subsequently call client.gather to get back the results from this set of futures. Instead of waiting for the functions to execute, control continues past the client.gather call and the application exits with the warning below.

/usr/lib/python3.7/asyncio/events.py:88: RuntimeWarning: coroutine 'Client._run' was never awaited
  self._context.run(self._callback, *self._args)

If I add in a call to dask.distributed.wait(futures) before the call to Client.gather then exactly the same behaviour is observed.
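The warning names a coroutine rather than a future, which suggests that the objects collected from client.run in asynchronous mode are coroutine objects that must be awaited directly. The sketch below illustrates that distinction using only the standard library. It does not call dask: `fake_run` is a hypothetical stand-in for an async call such as client.run, and `passthrough_gather` is a hypothetical helper that, as an assumption about how this situation could arise, only awaits objects that are real futures.

```python
import asyncio
import inspect


async def fake_run(x: int) -> int:
    """Hypothetical stand-in for an async call such as ``client.run``."""
    await asyncio.sleep(0)
    return x


async def passthrough_gather(items):
    """Hypothetical helper: await real futures, return anything else unchanged."""
    out = []
    for item in items:
        if isinstance(item, asyncio.Future):
            out.append(await item)
        else:
            out.append(item)  # a bare coroutine is handed back unawaited
    return out


async def demo() -> bool:
    pending = [fake_run(i) for i in range(3)]
    returned = await passthrough_gather(pending)
    still_coroutines = all(inspect.iscoroutine(r) for r in returned)
    # Close the coroutines explicitly so this demo does not itself emit
    # "coroutine ... was never awaited".
    for r in returned:
        r.close()
    return still_coroutines


print(asyncio.run(demo()))
```

If the coroutines were left unclosed and unawaited, Python would report them with the same kind of RuntimeWarning shown above once they were garbage collected.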

What you expected to happen: I expect that calling Client.gather will wait for all the futures to execute and return the results from the futures rather than just returning the futures themselves. Additionally, I expect that if I call dask.distributed.wait on the list of futures, all the futures passed in will be awaited.

Minimal Complete Verifiable Example:

import asyncio
from itertools import cycle
import time

from dask.distributed import Client, wait


SLEEP_TIME = 2.0  # Time for coroutine to sleep in seconds


async def foo(x: int, sleep_time: float = SLEEP_TIME) -> int:
    """Sleeps then returns the input value."""
    print(f"Got {x}. Sleeping for {sleep_time}s.")
    await asyncio.sleep(sleep_time)
    print(f"Done for {x}!")
    return x


def bar(x: int, sleep_time: float = SLEEP_TIME) -> int:
    """Sleeps then returns the input value (blocking version)."""
    print(f"Got {x}. Sleeping for {sleep_time}s.")
    time.sleep(sleep_time)
    print(f"Done for {x}!")
    return x


async def main() -> None:
    """Entry point for dask run."""
    # Create an async client using the local machine as a cluster.
    client = await Client(asynchronous=True)

    # Get the list of workers from the scheduler.
    workers = cycle(client.scheduler_info()["workers"])

    t_start = time.time()

    # Assign the functions to workers in round robin fashion.
    futures = [client.run(foo, i, workers=[next(workers)]) for i in range(3)]
    # futures = [client.run(bar, i, workers=[next(workers)]) for i in range(3)]

    # Await all the futures using gather.
    # wait(futures)  # Explicitly waiting for all the futures makes no difference.
    # NOTE : Futures objects are not awaited when calling `client.gather`.
    results = await client.gather(futures)
    # NOTE : Using `asyncio.gather` awaits the futures as expected.
    # results = await asyncio.gather(*futures)

    # Display the collected results.
    print(f"results: {results}")
    print(f"Execution took {time.time() - t_start}s.")

    # Close the client connection.
    await client.close()


if __name__ == "__main__":
    asyncio.get_event_loop().run_until_complete(main())

Anything else we should know:

  • If the call to Client.gather is replaced with asyncio.gather then the expected behaviour is observed.
  • Replacing the async function foo with the blocking function bar gives the same results.
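The asyncio.gather behaviour noted above can be reproduced without a cluster. In this sketch `fake_run` is a hypothetical stand-in for client.run that returns a single result keyed by worker address, mirroring the dictionary-per-call shape that client.run is documented to return; the worker addresses and the short sleep time are illustrative assumptions.

```python
import asyncio
import time
from typing import Dict, List

# Hypothetical worker addresses, for illustration only.
WORKERS = ["tcp://127.0.0.1:40001", "tcp://127.0.0.1:40002"]


async def fake_run(x: int, worker: str, sleep_time: float = 0.1) -> Dict[str, int]:
    """Hypothetical stand-in for ``client.run``: one result keyed by worker."""
    await asyncio.sleep(sleep_time)
    return {worker: x}


async def run_all(n: int) -> List[Dict[str, int]]:
    coros = [fake_run(i, WORKERS[i % len(WORKERS)]) for i in range(n)]
    # asyncio.gather wraps each coroutine in a task, so they run concurrently
    # and the results come back in the order the coroutines were passed in.
    return await asyncio.gather(*coros)


t_start = time.time()
results = asyncio.run(run_all(3))
print(f"results: {results}")
print(f"Execution took {time.time() - t_start:.2f}s.")
```

Because asyncio.gather accepts any awaitable, it works on coroutine objects as well as on futures, which is consistent with the first bullet above.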

Environment:

  • Dask version: 2021.6.0
  • Python version: 3.7.10
  • Operating System: Ubuntu 20.04
  • Install method (conda, pip, source): pip

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
chrisk314 commented, Jun 24, 2021

@fjetter thanks for that suggestion with your example above. That approach hadn’t occurred to me. I will try it out today. I’d like to try it with the k8s deployment of dask but I’m finding that when I deploy the default helm chart on my machine the workers cannot discover the scheduler for some reason. Will open a separate issue on that front if I can’t figure out the problem.

In terms of the pubsub stuff, I did notice that somewhere. At the moment Queue fits nicely for us behind the abstract interface we’ve defined with our Channel class, and it seems that in principle it should do the job, if I can get all this stuff working nicely.

0 reactions
jrbourbeau commented, Oct 14, 2021

This should be closed via https://github.com/dask/distributed/pull/5151. @chrisk314 feel free to re-open if that’s not the case
