Starting local cluster from within an embedded interpreter

I ran into this issue with the Python interpreter embedded in a beta build of the upcoming Gatan Digital Micrograph release, but it probably applies to any application that embeds a Python interpreter.

Starting a process-based local cluster in the normal way spawns new instances of the host application, at least on Windows. That is not the intended behavior when starting a cluster. As far as I can see, this results from the way the multiprocessing module starts new processes.

As an alternative solution, we’ve developed a launch method that uses the subprocess module: https://github.com/LiberTEM/LiberTEM/blob/master/src/libertem/executor/dask.py#L148

Since this is not specific to LiberTEM and affects anyone who uses dask.distributed from within an embedded interpreter, could this be something useful to add to dask.distributed?

The code in LiberTEM is not perfect yet because it tends to open a swarm of temporary shell windows, but it works, including in our Continuous Integration tests on Linux and Windows. Some Windows-specific parameters probably have to be passed to subprocess.Popen() to prevent the stray shell windows.
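
The stray shell windows mentioned above might be avoided along the following lines. This is only a sketch, not the LiberTEM code: the helper names `popen_kwargs_without_console` and `launch_quietly` are hypothetical, and it assumes Python 3.7 or later, where `subprocess.CREATE_NO_WINDOW` is available on Windows. Whether this flag alone is enough inside a particular host application is untested here.

```python
import subprocess
import sys


def popen_kwargs_without_console():
    """Return extra Popen keyword arguments that suppress console windows.

    Hypothetical helper: on Windows it sets CREATE_NO_WINDOW so the child
    process gets no console of its own; elsewhere it returns nothing extra.
    """
    kwargs = {}
    if sys.platform == "win32":
        kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
    return kwargs


def launch_quietly(args):
    """Start a child process with captured output and no visible console."""
    return subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        **popen_kwargs_without_console(),
    )


if __name__ == "__main__":
    # Stand-in command; a real launcher would start scheduler and workers.
    proc = launch_quietly([sys.executable, "-c", "print('ok')"])
    output, _ = proc.communicate()
    print(output.decode().strip(), proc.returncode)
```

On platforms other than Windows the helper adds no arguments, so the same launch code can run unchanged in Linux and Windows tests.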

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

2 reactions
itamarst commented, Dec 7, 2018

I believe the solution is to use multiprocessing.set_executable (https://docs.python.org/3/library/multiprocessing.html#multiprocessing.set_executable), which lets you set a different Python interpreter for multiprocessing to use.
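
A minimal sketch of that suggestion, assuming a standalone Python executable is installed somewhere on the machine and its path is known. The helper name `use_standalone_python` is hypothetical; `multiprocessing.set_executable` and `multiprocessing.spawn.get_executable` are standard-library functions. The call has to happen before any child process is started, for example before creating the local cluster.

```python
import multiprocessing
import multiprocessing.spawn
import os
import sys


def use_standalone_python(python_path):
    """Make multiprocessing start children with python_path.

    Hypothetical helper: in an embedded interpreter sys.executable may
    point at the host application, so spawned children would otherwise
    launch new instances of that application.
    """
    python_path = os.path.abspath(python_path)
    if not os.path.isfile(python_path):
        raise FileNotFoundError(f"No interpreter found at {python_path}")
    multiprocessing.set_executable(python_path)
    # Report what multiprocessing will now use for spawned children.
    return multiprocessing.spawn.get_executable()


if __name__ == "__main__":
    # In a normal interpreter sys.executable is already correct; inside an
    # embedded interpreter, pass the path of a real Python executable.
    print(use_standalone_python(sys.executable))
```

The setting only matters for child processes started with the spawn start method, which is the only method available on Windows.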

0 reactions
uellue commented, Dec 12, 2018

I confirmed that the method suggested by @itamarst works for my case. Thank you again! In #2409 there’s a short note on this for the documentation.
