
Program hangs when instantiating a GP using multiprocessing

See original GitHub issue

I’m trying to do what seems like a simple task: use multiprocessing to parallelize the optimize() call over many unique GPs. Here’s a minimal example of what I’m trying to do.

from GPy.core.gp import GP
from GPy.kern import White
from GPy.likelihoods.gaussian import Gaussian
from GPy.inference.latent_function_inference.laplace import Laplace
from multiprocessing import Pool
import numpy as np

# Wrapper needed so the function is pickleable, which is required for multiprocessing.Pool
def opt_wrapper(gp):
    return gp.optimize()  # Can replace with 'return 1' and the program still hangs

size = 100  # Program works when this is low enough
inference_method = Laplace()  # Program works when this is None
models = [GP(X=np.arange(size).reshape(size, 1), Y=np.arange(size).reshape(size, 1), kernel=White(1), likelihood=Gaussian(), inference_method=inference_method) for _ in range(1)]

print("Starting pool...")
pool = Pool(1)
print(pool.map(opt_wrapper, models))
pool.close()
pool.join()

The program simply hangs after printing “Starting pool…” Annoyingly, it also leaves a zombie process for each worker in the pool (just 1 in this example).

The program works just fine when any one of the following conditions is true:

  1. When size is less than about 60. For larger values, it hangs.
  2. When Laplace() is replaced with None. The Gaussian likelihood then defaults to ExactGaussianInference(); however, my actual project uses a custom likelihood that requires Laplace().
  3. When pool.map is replaced with the built-in map.
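As an aside, one generic mitigation for fork-related Pool hangs (an assumption on my part, not verified against GPy) is the 'spawn' start method, which gives each worker a fresh interpreter instead of a forked copy of the parent's state. A stdlib-only sketch, with an illustrative worker function in place of the GP code:

```python
# Sketch: run a Pool under the 'spawn' start method. The square()
# function is a placeholder, not part of GPy; spawn requires the
# worker function to be defined at module top level so it pickles.
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

Whether this helps depends on what state the forked child inherits (e.g. threads or locks held by numerical libraries at fork time).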

Lastly, it still breaks when you replace return gp.optimize() with return 1. Similarly, the following program hangs (same imports):

def make_gp(dummy):
    inference_method = Laplace()  # Again, the program works when Laplace() becomes None
    gp = GP(X=np.arange(size).reshape(size, 1), Y=np.arange(size).reshape(size, 1), kernel=White(1), likelihood=Gaussian(), inference_method=inference_method)
    return 1

size = 100  # Again, the program works when this is small
pool = Pool(1)
print(pool.map(make_gp, ['dummy']))  # Again, works with the built-in `map`
pool.close()
pool.join()
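While debugging this kind of hang, a defensive pattern (a generic stdlib sketch, not GPy-specific) is to use map_async with a timeout so a wedged worker raises an exception instead of blocking the parent forever:

```python
# Sketch: bound the wait on pool results so a hang surfaces as a
# TimeoutError rather than an indefinite block. work() is a
# placeholder for the real per-model task.
from multiprocessing import Pool, TimeoutError

def work(x):
    return x + 1

if __name__ == "__main__":
    with Pool(1) as pool:
        try:
            results = pool.map_async(work, [1, 2, 3]).get(timeout=30)
            print(results)  # [2, 3, 4]
        except TimeoutError:
            print("workers appear hung; inspect them with a sampling profiler or gdb")
```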

It seems to be an issue of instantiating or copying a GP (both with Laplace and above a certain size) within a new process. It seems highly odd and highly specific. Any help is greatly appreciated.

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Reactions: 1
  • Comments: 11 (3 by maintainers)

Top GitHub Comments

1 reaction
brendenpetersen commented, Mar 19, 2018

Hi @ahartikainen, I was not using Jupyter Notebook. Python was executed from command-line. So I don’t think those changes would fix the problem.

I’ve moved on from this project, but the issue was actually a limitation with Python’s multiprocessing, which uses OS pipes under the hood and is therefore limited by buffer sizes. This explains why the program works when size is small enough, as it puts it under the buffer size, and why it worked for @mzwiessele, whose OS likely had a different buffer size.

One (of many) explanations here: https://sopython.com/canon/82/programs-using-multiprocessing-hang-deadlock-and-never-complete/
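The failure mode described in that link can be reproduced with the stdlib alone: a child that sends a large object back through a multiprocessing.Queue must have its data drained before join(), because the child blocks while writing to the full underlying pipe, and join() then waits on a child that never exits. A minimal sketch of the correct ordering (the 4 MB payload is illustrative):

```python
# Sketch: drain the queue before joining the child. Reversing the
# q.get() and p.join() lines below can deadlock on payloads larger
# than the OS pipe buffer, which is the behavior described above.
from multiprocessing import Process, Queue

def worker(q):
    q.put(b"x" * (1 << 22))  # ~4 MB, far larger than a typical pipe buffer

if __name__ == "__main__":
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    data = q.get()    # drain the child's output first...
    p.join()          # ...then join
    print(len(data))  # 4194304
```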

0 reactions
patel-zeel commented, Feb 16, 2021

I had a similar problem while benchmarking on AMD and Intel CPUs: GPy multiprocessing performed very poorly on AMD but did well on Intel. Does anyone have a similar experience?
