Starting local cluster from within an embedded interpreter
I’ve experienced this issue with the Python interpreter in a beta build of the upcoming Gatan Digital Micrograph release in particular, but it probably applies to other cases where a Python interpreter is embedded.
Starting a process-based local cluster with the normal method spawns new instances of the host application, at least on Windows. That is not the intended behavior when starting a cluster. As far as I can see, this is a result of how the multiprocessing module kicks off new processes.
As an alternative solution, we’ve developed a launch method that uses the subprocess module: https://github.com/LiberTEM/LiberTEM/blob/master/src/libertem/executor/dask.py#L148
Since this is not specific to LiberTEM but affects any dask.distributed user working from within an embedded interpreter, could this be something useful to add to dask.distributed?
The code in LiberTEM is not 100 % perfect yet because it tends to open a swarm of temporary shell windows, but at least it works, including in our Continuous Integration tests on Linux and Windows. Probably some Windows-specific parameters have to be set for subprocess.Popen() to prevent the stray shell windows.
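One candidate for such a Windows-specific parameter is the `creationflags` argument of subprocess.Popen() with `subprocess.CREATE_NO_WINDOW`. The following is only a minimal sketch under that assumption; it has not been tried inside Digital Micrograph, and the helper name `popen_without_console` is hypothetical and not part of LiberTEM or dask.distributed.

```python
import subprocess
import sys


def popen_without_console(args, **kwargs):
    """Start a child process, asking Windows not to open a console window.

    Hypothetical helper for illustration only. On other platforms the
    call is passed through to subprocess.Popen() unchanged.
    """
    if sys.platform == "win32":
        # CREATE_NO_WINDOW only exists on Windows (Python 3.7+).
        # setdefault() keeps any creationflags the caller passed explicitly.
        kwargs.setdefault("creationflags", subprocess.CREATE_NO_WINDOW)
    return subprocess.Popen(args, **kwargs)
```

A worker process could then be launched with, for example, `popen_without_console([sys.executable, "-c", "print('ok')"], stdout=subprocess.PIPE)`. Whether this removes all stray windows in the embedded case would still need to be checked.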
Issue Analytics
- Created 5 years ago
- Comments: 7 (2 by maintainers)
Top GitHub Comments
I believe the solution is to use https://docs.python.org/3/library/multiprocessing.html#multiprocessing.set_executable - you can set a different Python interpreter for multiprocessing to use.

I confirmed that the method suggested by @itamarst works for my case. Thank you again! In #2409 there’s a short info on that for the documentation.
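For readers who land here, the suggestion can be sketched roughly as follows. This is a sketch under stated assumptions, not the code from #2409: the helper name `use_standalone_interpreter` is hypothetical, and the default location `python.exe` under `sys.exec_prefix` is only a guess that mirrors the example in the Python documentation. An embedding application may ship its interpreter elsewhere, in which case the path has to be passed explicitly.

```python
import multiprocessing
import os
import sys


def use_standalone_interpreter(python_path=None):
    """Tell multiprocessing which executable to launch for child processes.

    Hypothetical helper for illustration. If no path is given, it guesses
    python.exe below sys.exec_prefix on Windows (an assumption that may
    not hold for every embedding application) and sys.executable elsewhere.
    """
    if python_path is None:
        if sys.platform == "win32":
            python_path = os.path.join(sys.exec_prefix, "python.exe")
        else:
            python_path = sys.executable
    # Child processes started by multiprocessing will now use this
    # executable instead of the host application.
    multiprocessing.set_executable(python_path)
    return python_path
```

The call has to happen before the process-based local cluster is started, so that the worker processes are launched with the chosen interpreter.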