
Can not save model using model.save following multi_gpu_model

See original GitHub issue

Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on StackOverflow or join the Keras Slack channel and ask there instead of filing a GitHub issue.

Thank you!

  • Check that you are up-to-date with the master branch of Keras. You can update with: pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps

  • If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.

This is a short and simple issue. Since upgrading to Keras 2.0.9 I have been using the multi_gpu_model utility, but I can't seem to save my models or best weights using model.save.

The error I get is

TypeError: can't pickle module objects

I suspect there is some problem gaining access to the model object. Is there a workaround for this issue?
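For context, here is a minimal sketch of the pattern that reproduces this error. It assumes Keras 2.0.9+ on TensorFlow with two visible GPUs; the toy model, data shapes, and file name below are illustrative, not taken from the issue.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import multi_gpu_model

# Toy single-device "template" model (illustrative layer sizes).
model = Sequential([Dense(10, input_shape=(100,), activation='softmax')])

# Replicate it across 2 GPUs for training.
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
parallel_model.fit(np.random.rand(64, 100), np.random.rand(64, 10), epochs=1)

# Calling save on the model returned by multi_gpu_model is what fails:
parallel_model.save('model.h5')  # TypeError: can't pickle module objects
```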

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 12 (3 by maintainers)

Top GitHub Comments

16 reactions
fchollet commented, Nov 10, 2017

For now we recommend saving the original (template) model instead of the parallel model. I.e. call save on the model you passed to multi_gpu_model, not the model returned by it.

Both models share the same weights.
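Continuing the sketch above (same toy model, illustrative file names), this recommendation amounts to keeping a reference to the template model and calling save on it rather than on the replica:

```python
# Train through the multi-GPU replica...
parallel_model.fit(np.random.rand(64, 100), np.random.rand(64, 10), epochs=1)

# ...but persist the template model that was passed to multi_gpu_model.
# Its weights are shared with (and updated by) the parallel model.
model.save('template_model.h5')

# The saved file reloads as an ordinary single-device model.
from keras.models import load_model
restored = load_model('template_model.h5')
```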

4 reactions
Heisenberg0391 commented, Dec 8, 2018

So, as mentioned above, I should train with parallel_model but save the original model. But what if I want to save weights on every epoch as checkpoints using a callback? What should I do?
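The thread does not answer this directly; one commonly used workaround is a small custom callback that holds a reference to the template model and saves it at the end of each epoch. The class name and path pattern below are illustrative, just a sketch:

```python
from keras.callbacks import Callback

class TemplateCheckpoint(Callback):
    """Save the single-device template model's weights after every epoch."""
    def __init__(self, template_model, path_pattern):
        super(TemplateCheckpoint, self).__init__()
        self.template_model = template_model
        self.path_pattern = path_pattern  # e.g. 'weights.{epoch:02d}.h5'

    def on_epoch_end(self, epoch, logs=None):
        self.template_model.save_weights(self.path_pattern.format(epoch=epoch))

# Usage with the sketch above: pass the callback to the *parallel* model's fit().
# parallel_model.fit(x, y, epochs=10,
#                    callbacks=[TemplateCheckpoint(model, 'weights.{epoch:02d}.h5')])
```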

Read more comments on GitHub >

Top Results From Across the Web

Can not save model using model.save following ...
To be honest, the easiest approach to this is to actually examine the multi gpu parallel model using parallel_model.summary().
Read more >
Error occurs when saving model in multi-gpu settings
save_pretrained(output_dir), I tried to load the saved model using .from_pretrained(output_dir), but got the following error message: OSError: Unable to load ...
Read more >
Getting Started with Distributed Data Parallel - PyTorch
When using DDP, one optimization is to save the model in only one process and then load it to all processes, reducing write...
Read more >
Multi GPU Model Training: Monitoring and Optimizing
Do you struggle with monitoring and optimizing the training of Deep Neural Networks on multiple GPUs? If yes, you're in the right place....
Read more >
multi_gpu_model - TensorFlow for R - RStudio
If the model is not defined under any preceding device scope, you can still rescue it by ... To save the multi-gpu model,...
Read more >
