Cannot call `load_model` on network trained using `multi_gpu`.
I came across your post on Medium and was instantly hooked. Nice job!
I’ve been developing a series of deep learning experiments that use only a single GPU and decided to switch them over to a multi-GPU setting. After training, the models are serialized to disk via model.save. However, when I try to call load_model to load the pre-trained network from disk, I get an error:
[INFO] loading model...
Traceback (most recent call last):
  File "rank_accuracy.py", line 28, in <module>
    model = load_model(config.MODEL_PATH)
  File "/home/ubuntu/.virtualenvs/dlbook/local/lib/python2.7/site-packages/keras/models.py", line 140, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/home/ubuntu/.virtualenvs/dlbook/local/lib/python2.7/site-packages/keras/models.py", line 189, in model_from_config
    return layer_from_config(config, custom_objects=custom_objects)
  File "/home/ubuntu/.virtualenvs/dlbook/local/lib/python2.7/site-packages/keras/utils/layer_utils.py", line 34, in layer_from_config
    return layer_class.from_config(config['config'])
  File "/home/ubuntu/.virtualenvs/dlbook/local/lib/python2.7/site-packages/keras/engine/topology.py", line 2395, in from_config
    process_layer(layer_data)
  File "/home/ubuntu/.virtualenvs/dlbook/local/lib/python2.7/site-packages/keras/engine/topology.py", line 2390, in process_layer
    layer(input_tensors[0])
  File "/home/ubuntu/.virtualenvs/dlbook/local/lib/python2.7/site-packages/keras/engine/topology.py", line 517, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/home/ubuntu/.virtualenvs/dlbook/local/lib/python2.7/site-packages/keras/engine/topology.py", line 571, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/home/ubuntu/.virtualenvs/dlbook/local/lib/python2.7/site-packages/keras/engine/topology.py", line 155, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/home/ubuntu/.virtualenvs/dlbook/local/lib/python2.7/site-packages/keras/layers/core.py", line 587, in call
    return self.function(x, **arguments)
  File "/home/ubuntu/deep-learning-book/dataset_to_hdf5/multi_gpu.py", line 9, in get_slice
    shape = tf.shape(data)
NameError: global name 'tf' is not defined
Looking at multi_gpu.py, it’s clear that TensorFlow is imported, so I’m not sure why the error is being generated.
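For reference, a minimal sketch of the save/load pattern described above (the toy network, gpu_count, and file names are illustrative assumptions; make_parallel is the kuza55-style helper that appears in the traceback):

```python
from keras.layers import Dense, Input
from keras.models import Model, load_model
from multi_gpu import make_parallel  # helper seen in the traceback above

# Training script: build a network, wrap it for multi-GPU training, save it
inputs = Input(shape=(32,))
outputs = Dense(10, activation="softmax")(inputs)
parallel_model = make_parallel(Model(inputs, outputs), gpu_count=2)
parallel_model.save("multi_gpu_model.h5")

# Evaluation script: this is the call that raises
#   NameError: global name 'tf' is not defined
model = load_model("multi_gpu_model.h5")
```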
Issue Analytics
- Created: 7 years ago
- Reactions: 1
- Comments: 9 (1 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I have an ugly but functional workaround:
Change the return statement to the following code:
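A minimal sketch of what that replacement might look like, assuming the kuza55-style make_parallel helper (where merged holds the concatenated per-GPU outputs and model is the original single-GPU network passed into the function); the exact snippet may differ:

```python
# Tail of make_parallel(model, gpu_count) -- hypothetical reconstruction.
# Keyword names depend on the Keras version (input=/output= in Keras 1.x).
new_model = Model(inputs=model.inputs, outputs=merged)

def new_save(self_, filepath, overwrite=True):
    # Delegate saving to the original single-GPU template model so the
    # serialized graph contains no GPU-splitting Lambda layers.
    model.save(filepath, overwrite)

# Bind new_save as the parallel model's save method (the monkey-patch)
new_model.save = type(model.save)(new_save, new_model)
return new_model
```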
This monkey-patches the old model’s save onto the new model’s save (calling the parallel model’s save will call the simple model’s save).
When loading, you must load the simple model before creating the parallel model.
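A sketch of that load order under the same assumptions (file name and gpu_count are illustrative):

```python
from keras.models import load_model
from multi_gpu import make_parallel  # hypothetical import of the helper above

# Load the simple (single-GPU) model first...
model = load_model("single_gpu_model.h5")
# ...then recreate the parallel wrapper around it for multi-GPU work
parallel_model = make_parallel(model, gpu_count=2)
```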
@aman-tiwari and anyone else who may stumble across this: you can recover an already saved multi-GPU model simply. Temporarily edit your virtualenv’s keras/layers/core.py to have the necessary import: import tensorflow as tf. Then:
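A sketch of the kind of recovery this enables (file names are illustrative; the nested sub-model extraction assumes the kuza55 recipe, which reuses the original network as a layer inside the parallel graph):

```python
from keras.models import Model, load_model

# With tf importable from keras/layers/core.py, the Lambda layers in the
# saved multi-GPU graph deserialize without the NameError.
parallel = load_model("multi_gpu_model.h5")

# Pull out the original single-GPU network and save it on its own, so
# future loads no longer depend on the temporary core.py edit.
inner = [layer for layer in parallel.layers if isinstance(layer, Model)]
if inner:
    inner[0].save("single_gpu_model.h5")
```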