Issue loading Keras LSTM in subprocess


I’m trying to train an RL model in parallel that relies on the output of a pre-trained Keras model. I’m trying to construct the Keras model in each subprocess and then set the weights from a saved file. The model is initialized as below:

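# Imports the snippet relies on (standalone Keras); the code runs inside a class method.
from keras.layers import Input, LSTM, Dense
from keras.models import Model

print("building...")
x = Input(shape=(None, self.z_dim + self.action_dim))
print('input done')
lstm = LSTM(self.hidden_units, return_sequences=True, return_state=True)
print('lstm done')
lstm_out, _, _ = lstm(x)  # execution hangs here in a subprocess
print('dense started')
mdn = Dense(self.n_mixtures * (3 * self.z_dim))(lstm_out)
print('dense done')
rnn = Model(x, mdn)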

However, the execution hangs on the line lstm_out, _, _ = lstm(x). The model initializes without problems in the main module. I am also able to initialize other models (that don’t have LSTM layers) in subprocesses without problem. Does anyone have any idea what the problem could be?

Let me know if there’s any other relevant info I can provide. I’m up to date on Keras and TF, and I’ve only tested this issue on macOS.


Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
bglick13 commented, May 2, 2018

I’m actually having some positive results using multiprocessing and making sure I run

multiprocessing.set_start_method('spawn')

at the start of my main script! That seems to let me get past the TensorFlow locking issue. I’ll be curious to hear if it works for you as well.
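A minimal sketch of the spawn-based approach described in this comment. It is not the poster's actual script: the function names (make_context, build_and_load, start_workers), the layer sizes, and the weights path are all hypothetical, and it uses multiprocessing.get_context('spawn') rather than set_start_method so that the choice stays local to these workers.

```python
import multiprocessing as mp


def make_context(start_method="spawn"):
    """Return a multiprocessing context that uses the given start method."""
    # get_context() scopes the choice to this context object, unlike
    # set_start_method(), which fixes it once for the whole program.
    return mp.get_context(start_method)


def build_and_load(worker_id, weights_path):
    """Hypothetical worker: build the Keras model inside the child process."""
    # Import Keras here so the backend is initialised in the child
    # rather than inherited from the parent.
    from keras.layers import Input, LSTM, Dense
    from keras.models import Model

    # Placeholder sizes standing in for z_dim, action_dim, hidden_units, n_mixtures.
    z_dim, action_dim, hidden_units, n_mixtures = 32, 3, 256, 5

    x = Input(shape=(None, z_dim + action_dim))
    lstm_out, _, _ = LSTM(hidden_units, return_sequences=True, return_state=True)(x)
    mdn = Dense(n_mixtures * (3 * z_dim))(lstm_out)
    rnn = Model(x, mdn)
    rnn.load_weights(weights_path)  # set the weights from a saved file
    # ... the RL rollout that uses rnn would go here ...


def start_workers(n_workers, weights_path):
    """Start n_workers child processes and return the Process objects."""
    ctx = make_context()
    procs = [
        ctx.Process(target=build_and_load, args=(i, weights_path))
        for i in range(n_workers)
    ]
    for p in procs:
        p.start()
    return procs
```

With spawn, each child re-imports the main module, so start_workers(...) should be called (and the returned processes joined) under an if __name__ == "__main__": guard; otherwise the top-level code would run again in every child.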

0 reactions
jlopezpena commented, May 2, 2018

I haven’t tried that one; I’ll give it a go tomorrow when I get back to the office! I was trying the fork and forkserver start methods, but that didn’t help. Somehow I never thought of using the spawn method.

Since you are working on Celery anyway, I think another (safer?) option for you would be to replace multiprocessing with billiard, which should be more amenable to Celery tasks.


Top Results From Across the Web

Keras + Tensorflow and Multiprocessing in Python
From my experience - the problem lies in loading Keras to one process and then spawning a new process when the keras has...

Multi-worker training with Keras | TensorFlow Core
This tutorial demonstrates how to perform multi-worker distributed training with a Keras model and the Model.fit API using the tf.distribute.

Airline Passengers Keras LSTM - Kaggle
Explore and run machine learning code with Kaggle Notebooks | Using data from airline passengers.

How to Save and Load Your Keras Deep Learning Model
In this post, you will discover how to save your Keras models to files and load them up again to make predictions.

Natural Language Processing – Weights & Biases - Wandb
LSTM units in RNNs help combat the vanishing gradient problem. by using neurons ... as np import subprocess from tensorflow.keras.preprocessing import text, ...
