Client should call LoopRunner.stop()

What happened: distributed.Client doesn't clean up its LoopRunner. As a result, the tornado.ioloop.IOLoop and its related threads keep running after the Client is closed. The issue appears when you run multiple clients with asynchronous=False. This happens because Client calls LoopRunner.start() but never calls LoopRunner.stop().

What you expected to happen: distributed.Client should stop its LoopRunner to clean up resources.

Minimal Complete Verifiable Example:

import threading
import time
from distributed import LocalCluster, Client
from distributed.utils import LoopRunner


def main():
    threads_before = threading.enumerate()

    runner = LoopRunner(asynchronous=False)
    runner.start()

    cluster = LocalCluster(loop=runner.loop, asynchronous=False)
    client = Client(cluster, loop=runner.loop, asynchronous=False)

    # NOTE: uncomment to clean up resources correctly.
    # client._should_close_loop = True

    client.close()
    cluster.close()

    runner.stop()

    # Wait until daemon threads have finished.
    time.sleep(5)

    threads_after = threading.enumerate()

    if threads_before != threads_after:
        print('Before:', threads_before)
        print('After:', threads_after)
        raise AssertionError


if __name__ == '__main__':
    main()

If you set client._should_close_loop = True, the client will call LoopRunner.stop() when it is closed. See https://github.com/dask/distributed/blob/master/distributed/client.py#L1435-L1436.

Anything else we need to know?: I use an old distributed/dask version, but the latest sources have the same code in this part. LoopRunner counts start() calls, so there must be the same number of start() and stop() calls for the IO loop to stop correctly.
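
The start()/stop() counting described above can be illustrated with a toy model. This is only a sketch under stated assumptions: CountingRunner is a hypothetical class written for this illustration, not distributed's LoopRunner, and it uses a plain background thread instead of a tornado IO loop. It shows why one missing stop() call keeps the background thread alive.

```python
import threading


class CountingRunner:
    """Hypothetical toy model of a reference-counted runner.

    Not distributed's LoopRunner: it only mimics the idea that the
    background thread stops when stop() has been called as many
    times as start().
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0
        self._stop_event = None
        self._thread = None

    def start(self):
        with self._lock:
            self._count += 1
            if self._count == 1:
                # First user: spin up the background thread.
                self._stop_event = threading.Event()
                self._thread = threading.Thread(
                    target=self._stop_event.wait, name="toy-loop", daemon=True
                )
                self._thread.start()

    def stop(self):
        with self._lock:
            if self._count == 0:
                return  # Nothing to stop.
            self._count -= 1
            if self._count > 0:
                return  # Other users still hold the runner.
            thread = self._thread
            self._stop_event.set()
            self._thread = None
        thread.join(timeout=5)

    def is_running(self):
        with self._lock:
            return self._thread is not None and self._thread.is_alive()


runner = CountingRunner()
runner.start()  # e.g. the caller that owns the loop
runner.start()  # e.g. a client sharing the same loop
runner.stop()   # only one stop(): the thread must stay alive
still_running_after_one_stop = runner.is_running()
runner.stop()   # balanced: the thread is stopped and joined
running_after_balanced_stops = runner.is_running()
```

In this model, a user that calls start() without a matching stop() behaves like the Client in the report: the owner's own stop() call only decrements the counter and the thread survives.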

Environment:

  • Dask version: 1.2.2
  • Python version: 2.7.17 (Anaconda)
  • Operating System: Linux Mint 18.3 Sylvia
  • Install method (conda, pip, source): conda/pip

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

skozlovf commented, Aug 26, 2020

Here is a more advanced, Python 3 based example demonstrating the issue.


import threading
import time
from distributed import LocalCluster, Client
from distributed.utils import LoopRunner


def print_list(title, lst):
    print('%s (num threads: %d):' % (title, len(lst)))
    for i, x in enumerate(lst):
        print('  %d: %s' % (i, x))


def run(cluster_runner, num_iter=3):
    for i in range(num_iter):
        print('Iteration %d/%d:' % (i + 1, num_iter))

        threads_before = threading.enumerate()

        loop_runner = LoopRunner(asynchronous=False)
        loop_runner.start()

        cluster_runner(loop_runner.loop)

        loop_runner.stop()

        # Let the IO loop process events and wait until daemon threads have finished.
        time.sleep(1)

        threads_after = threading.enumerate()

        print_list('Threads Before', threads_before)
        print_list('Threads After', threads_after)


def direct(loop):
    cluster = LocalCluster(loop=loop, asynchronous=False)
    client = Client(cluster, loop=loop, asynchronous=False)
    # client._should_close_loop = True  # (un)comment
    time.sleep(1)  # Let the IO loop process events.
    client.close()
    cluster.close()


def context(loop):
    with LocalCluster(loop=loop, asynchronous=False) as cluster:
        client = Client(cluster, loop=loop, asynchronous=False)
        # client._should_close_loop = True  # (un)comment
        time.sleep(1)  # Let the IO loop process events.
        client.close()


def main():
    run(context)


if __name__ == '__main__':
    main()

Try run(direct) and run(context) with and without client._should_close_loop = True. You will see the number of threads grow with each iteration, except for run(context) with client._should_close_loop = True.
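
To make the thread growth easier to spot than by reading two printed lists, the before/after comparison can be wrapped in a small helper. This is a minimal sketch using only the standard library. The leaked_threads name and its settle parameter are hypothetical, and the demo uses a plain thread rather than a real Client so that it runs without a cluster.

```python
import threading
import time
from contextlib import contextmanager


@contextmanager
def leaked_threads(settle=0.2):
    """Collect threads that were started inside the block and are still alive.

    Hypothetical helper: `settle` is a short grace period (in seconds)
    that gives threads which are shutting down a chance to finish.
    """
    before = set(threading.enumerate())
    leaked = []
    try:
        yield leaked
    finally:
        time.sleep(settle)
        leaked.extend(
            t for t in threading.enumerate()
            if t not in before and t.is_alive()
        )


# A thread that is never stopped inside the block is reported as leaked.
stop = threading.Event()
with leaked_threads() as leaked:
    worker = threading.Thread(target=stop.wait, name="never-stopped", daemon=True)
    worker.start()
leaked_names = [t.name for t in leaked]
stop.set()
worker.join()

# A thread that finishes inside the block is not reported.
with leaked_threads() as clean:
    short_lived = threading.Thread(target=lambda: None)
    short_lived.start()
    short_lived.join()
```

Wrapping the body of each iteration in run() with such a helper would report exactly which threads outlive loop_runner.stop(), assuming the grace period is long enough for the IO loop thread to exit.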

skozlovf commented, Sep 5, 2020

OK, I will create a PR for Client soon, but I'm not sure about LocalCluster yet.
