
Can not save model using model.save following multi_gpu_model

See original GitHub issue

Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on StackOverflow or join the Keras Slack channel and ask there instead of filing a GitHub issue.

Thank you!

  • Check that you are up-to-date with the master branch of Keras. You can update with: pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps

  • If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.

This is a short and simple issue. Since upgrading to Keras 2.0.9 I have been using the multi_gpu_model utility, but I can't seem to save my models or best weights using model.save.

The error I get is

TypeError: can't pickle module objects

I suspect there is some problem gaining access to the model object. Is there a workaround for this issue?
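For context, here is a minimal sketch of the pattern that reproduces this error. It assumes Keras 2.0.9+ on TensorFlow with two visible GPUs; the toy model, data shapes, and file name below are illustrative, not taken from the issue.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import multi_gpu_model

# Toy single-device "template" model (illustrative layer sizes).
model = Sequential([Dense(10, input_shape=(100,), activation='softmax')])

# Replicate it across 2 GPUs for training.
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
parallel_model.fit(np.random.rand(64, 100), np.random.rand(64, 10), epochs=1)

# Calling save on the model returned by multi_gpu_model is what fails:
parallel_model.save('model.h5')  # TypeError: can't pickle module objects
```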

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 12 (3 by maintainers)

Top GitHub Comments

16 reactions
fchollet commented, Nov 10, 2017

For now we recommend saving the original (template) model instead of the parallel model. I.e. call save on the model you passed to multi_gpu_model, not the model returned by it.

Both models share the same weights.
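Continuing the sketch above (same toy model, illustrative file names), this recommendation amounts to keeping a reference to the template model and calling save on it rather than on the replica:

```python
# Train through the multi-GPU replica...
parallel_model.fit(np.random.rand(64, 100), np.random.rand(64, 10), epochs=1)

# ...but persist the template model that was passed to multi_gpu_model.
# Its weights are shared with (and updated by) the parallel model.
model.save('template_model.h5')

# The saved file reloads as an ordinary single-device model.
from keras.models import load_model
restored = load_model('template_model.h5')
```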

4 reactions
Heisenberg0391 commented, Dec 8, 2018

So, as mentioned above, I should train with parallel_model but save the original model. But what if I want to save weights on every epoch as checkpoints using a callback? What should I do?
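The thread does not answer this directly; one commonly used workaround is a small custom callback that holds a reference to the template model and saves it at the end of each epoch. The class name and path pattern below are illustrative, just a sketch:

```python
from keras.callbacks import Callback

class TemplateCheckpoint(Callback):
    """Save the single-device template model's weights after every epoch."""
    def __init__(self, template_model, path_pattern):
        super(TemplateCheckpoint, self).__init__()
        self.template_model = template_model
        self.path_pattern = path_pattern  # e.g. 'weights.{epoch:02d}.h5'

    def on_epoch_end(self, epoch, logs=None):
        self.template_model.save_weights(self.path_pattern.format(epoch=epoch))

# Usage with the sketch above: pass the callback to the *parallel* model's fit().
# parallel_model.fit(x, y, epochs=10,
#                    callbacks=[TemplateCheckpoint(model, 'weights.{epoch:02d}.h5')])
```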

Read more comments on GitHub >

Top Results From Across the Web

Can not save model using model.save following ...
To be honest, the easiest approach to this is to actually examine the multi gpu parallel model using parallel_model.summary().
Read more >
Error occurs when saving model in multi-gpu settings
save_pretrained(output_dir), I tried to load the saved model using .from_pretrained(output_dir), but got the following error message: OSError: Unable to load ...
Read more >
Getting Started with Distributed Data Parallel - PyTorch
When using DDP, one optimization is to save the model in only one process and then load it to all processes, reducing write...
Read more >
Multi GPU Model Training: Monitoring and Optimizing
Do you struggle with monitoring and optimizing the training of Deep Neural Networks on multiple GPUs? If yes, you're in the right place....
Read more >
multi_gpu_model - TensorFlow for R - RStudio
If the model is not defined under any preceding device scope, you can still rescue it by ... To save the multi-gpu model,...
Read more >
