Provide clearer message for multiprocess error

See discussion on #2515 for details. In summary, if a user tries to use the Client object or uses multiprocessing in an unexpected way (especially when first starting out), they can run into this error:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

The usual solution is to make sure the code that starts processes runs inside an if __name__ == '__main__': block. If that doesn't apply to you, you usually have to do some fancier handling. Is there a way for distributed/dask to catch the above exception and provide a simpler or clearer message?
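As a rough illustration of what the question asks for, the sketch below catches the bootstrapping RuntimeError and re-raises it with a shorter message. This is not how distributed/dask handles the error; the helper names (is_bootstrap_error, start_with_clearer_error) and the replacement message are hypothetical, and matching on the error text is an assumption that could break if Python changes the wording.

```python
# Hypothetical sketch: none of these helpers exist in dask or distributed.

# Phrase taken from the RuntimeError text quoted above.
BOOTSTRAP_MARKER = "bootstrapping phase"


def is_bootstrap_error(exc):
    """Return True if exc looks like the multiprocessing bootstrapping error."""
    # The original message spans several indented lines, so collapse whitespace.
    message = " ".join(str(exc).split())
    return isinstance(exc, RuntimeError) and BOOTSTRAP_MARKER in message


def start_with_clearer_error(start):
    """Call start() and replace the bootstrapping error with a shorter one."""
    try:
        return start()
    except RuntimeError as exc:
        if is_bootstrap_error(exc):
            raise RuntimeError(
                "Processes were started while the main module was still being "
                "imported. Move the startup code under "
                "`if __name__ == '__main__':`."
            ) from exc
        raise  # unrelated RuntimeErrors pass through unchanged
```

A caller would wrap whatever starts the processes, for example start_with_clearer_error(lambda: Client()). The original exception stays attached as __cause__, so the full traceback is still available.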

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 8
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

52 reactions
mrocklin commented, Mar 8, 2019

To be clear, this fails if run within a script (it works fine in an interpreter):

from dask.distributed import Client
client = Client()
# user code follows

The solution is this:

from dask.distributed import Client

if __name__ == '__main__':
    client = Client()
    # user code follows

This is exactly the same problem that exists with anything in Python that spins up processes, such as multiprocessing.Pool().

Another alternative is to avoid processes altogether with Client(processes=False), but that has other performance implications.
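The same guard applied to the multiprocessing.Pool() case mentioned above might look like the following minimal sketch. The square and main functions are made up for illustration; only Pool and its map method come from the standard library.

```python
from multiprocessing import Pool


def square(n):
    """Work function, defined at module level so child processes can find it."""
    return n * n


def main():
    # Creating the pool starts child processes, so this must only happen
    # when the file is run as a script, not when it is imported.
    with Pool(processes=2) as pool:
        return pool.map(square, range(5))


if __name__ == "__main__":
    print(main())
```

Run as a script, this prints [0, 1, 4, 9, 16]. Keeping the pool creation inside main() and calling it only under the guard means importing the file has no side effects.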

5 reactions
djhoese commented, May 9, 2021

@rgoggins This has to do with how the additional/child processes are created. Python has to “import” your script(s) in every child process. If you don’t put initialization code (code that should only be run once) into the if __name__ == "__main__": block then it gets run for every child process (at “import” time). This can end up with an infinite recursion as each process creates child processes that create more child processes and so on.

This is how I understand it at least.
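The re-import described above can be observed with a small script. This is a sketch that explicitly requests the "spawn" start method (the default on Windows and macOS) so the behaviour is the same on every platform; child_task is a made-up name.

```python
import multiprocessing as mp
import os

# Module-level code: with "spawn", this line runs once in the parent and
# again in each child, because the child imports this file a second time.
print(f"pid {os.getpid()}: top-level code runs with __name__ = {__name__!r}")


def child_task():
    print(f"pid {os.getpid()}: child task running")


if __name__ == "__main__":
    # Only the parent reaches this block. In the child the module is not
    # named "__main__", so no further processes are started.
    ctx = mp.get_context("spawn")
    child = ctx.Process(target=child_task)
    child.start()
    child.join()
```

When saved to a file and run as a script, the top-level print appears twice with two different pids. If the process were started at module level instead of under the guard, the child would attempt the same start during its own import, which is the situation Python reports with the RuntimeError above.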
