Starting local cluster from within an embedded interpreter

I’ve experienced this issue with the Python interpreter in a beta build for the upcoming Gatan Digital Micrograph release in particular, but it is probably applicable to other cases where a Python interpreter is embedded.

Starting a process-based local cluster with the normal method spawns new instances of the host application, at least on Windows. That is not the intended behavior when starting a cluster. As far as I could see, this is a result of the way the multiprocessing module kicks off new processes.

As an alternative solution, we’ve developed a launch method that uses the subprocess module: https://github.com/LiberTEM/LiberTEM/blob/master/src/libertem/executor/dask.py#L148

Since this is not specific to LiberTEM but affects anyone using dask.distributed from within an embedded interpreter, could this be something useful to add to dask.distributed?

The code in LiberTEM is not 100 % perfect yet because it tends to open a swarm of temporary shell windows, but at least it works, including in our continuous integration tests on Linux and Windows. Probably some Windows-specific parameters have to be set for subprocess.Popen() to prevent the stray shell windows.
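As a rough illustration of the Windows-specific parameters mentioned above, one commonly used option is the `subprocess.CREATE_NO_WINDOW` creation flag (Windows only, Python 3.7+). The sketch below is not the LiberTEM code; the helper names `hidden_window_kwargs` and `run_quietly` are hypothetical, and whether this flag removes the stray windows in the embedded case has not been verified here.

```python
import subprocess
import sys


def hidden_window_kwargs():
    """Return extra Popen keyword arguments that suppress console windows.

    Hypothetical helper: on Windows it passes CREATE_NO_WINDOW, elsewhere
    it returns no extra arguments because the flag does not exist there.
    """
    if sys.platform == "win32":
        return {"creationflags": subprocess.CREATE_NO_WINDOW}
    return {}


def run_quietly(args):
    """Run a command without opening a console window and capture its output."""
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        **hidden_window_kwargs(),
    )
    out, err = proc.communicate()
    return proc.returncode, out, err
```

The same keyword arguments could be passed to whatever `subprocess.Popen()` call launches the scheduler and workers.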

Issue Analytics

Created 5 years ago
Comments: 7 (2 by maintainers)

Top GitHub Comments

2 reactions

itamarst commented, Dec 7, 2018

I believe the solution is to use https://docs.python.org/3/library/multiprocessing.html#multiprocessing.set_executable - you can set a different Python interpreter for multiprocessing to use.
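A minimal sketch of that suggestion, assuming a regular Python executable is installed under `sys.exec_prefix`. The file names and the helper names `standalone_python_path` and `use_standalone_python` are assumptions for illustration, so adjust the path to your installation:

```python
import multiprocessing
import os
import sys


def standalone_python_path():
    """Guess where a regular Python executable lives.

    Assumption: the embedded interpreter shares its prefix with a normal
    Python installation. Adjust the path if yours is laid out differently.
    """
    if sys.platform == "win32":
        # pythonw.exe runs without a console window.
        return os.path.join(sys.exec_prefix, "pythonw.exe")
    return os.path.join(sys.exec_prefix, "bin", "python3")


def use_standalone_python(path=None):
    """Tell multiprocessing to start child processes with the given Python."""
    path = path or standalone_python_path()
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No Python executable at {path}")
    multiprocessing.set_executable(path)
    return path
```

Calling `use_standalone_python()` once, before the local cluster is created, should make multiprocessing launch that interpreter instead of the host application. On Unix, `set_executable` only takes effect with the 'spawn' start method.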

0 reactions

uellue commented, Dec 12, 2018

I confirmed that the method suggested by @itamarst works for my case. Thank you again! #2409 adds a short note about it to the documentation.
