
CuDNNGRU/LSTM weights trained on GPU can't be used on GRU/LSTM (i.e. CPU versions)

See original GitHub issue

If you train a model on the GPU using the CuDNN (GRU or LSTM) layers and save the weights, it is not possible to load those weights into their respective CPU variants.

Is this a bug or expected behavior? I tried setting implementation=0, 1, 2 on the GRU layer, but that didn't help.

The code below raises the following exception:

ValueError: Dimension 0 in both shapes must be equal, but are 48 and 96 for 'Assign_39' (op: 'Assign') with input shapes: [48], [96].

import numpy as np
import keras
from keras import layers
from keras.utils.np_utils import to_categorical

T = 10
k = 3
batch_size = 32
classes = 5

X = np.random.random((batch_size, T, k))
y = to_categorical(np.random.randint(0, classes, size=(batch_size,)), num_classes=classes)

# Train a model on the GPU using the CuDNN-backed GRU layer and save its weights.
model = keras.models.Sequential()
model.add(layers.InputLayer(input_shape=(T, k)))
model.add(layers.CuDNNGRU(16, return_sequences=False))
model.add(layers.Dense(classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='sgd')

model.fit(X, y)

model.save_weights('GPU.weights')

# Rebuild the same architecture with the plain CPU GRU layer and try to load the weights.
cpu_model = keras.models.Sequential()
cpu_model.add(layers.InputLayer(input_shape=(T, k)))
cpu_model.add(layers.GRU(16, return_sequences=False))
cpu_model.add(layers.Dense(classes, activation='softmax'))
cpu_model.compile(loss='categorical_crossentropy', optimizer='sgd')
cpu_model.load_weights('GPU.weights')  # raises the ValueError above
  • [x] Check that you are up-to-date with the master branch of Keras. You can update with: pip install git+git://github.com/keras-team/keras.git --upgrade --no-deps

  • [x] If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.

  • [x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Reactions: 2
  • Comments: 14 (5 by maintainers)

Top GitHub Comments

9 reactions
bzamecnik commented, Jan 18, 2018

Activations for CuDNN LSTM/GRU are hard-coded in CuDNN and cannot be changed from Keras. They correspond to activation='tanh' and recurrent_activation='sigmoid' (slightly different from the default hard_sigmoid in Keras).
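If you rebuild the model with the plain CPU layers, a minimal sketch of matching those fixed activations might look like the following (assuming a Keras 2.x version whose GRU exposes the reset_after argument, which is the formulation CuDNNGRU uses):

from keras import layers

# Sketch only: CPU layers configured so their activations line up with the
# ones CuDNN hard-codes (tanh + sigmoid) instead of Keras' default
# hard_sigmoid recurrent activation.
cpu_lstm = layers.LSTM(16, activation='tanh', recurrent_activation='sigmoid')
cpu_gru = layers.GRU(16, recurrent_activation='sigmoid', reset_after=True)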

5 reactions
bzamecnik commented, Jun 1, 2018

@rsmith49: CuDNNLSTM has 2x the biases. To use weights/biases from one implementation in the other, you need to perform a conversion. If you just dump the whole model's weights and load them again, the conversion is performed automatically (preferred). See the tests for examples.

When picking weights from just one layer and setting them on another layer, you need to do the conversion manually: https://github.com/keras-team/keras/blob/master/keras/engine/saving.py#L468. But I'd rather load the whole model, since preprocess_weights_for_loading() is more of an internal function than part of the public API.

from keras.engine.saving import preprocess_weights_for_loading

# Convert the CuDNN-format weights into the layout the CPU LSTM layer expects.
cudnn_weights = cudnn_lstm_model.get_weights()
weights2 = preprocess_weights_for_loading(lstm_layer, cudnn_weights)  # target layer, source weights
lstm_model.set_weights(weights2)  # set the converted weights, not the raw CuDNN ones
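For a model with several layers, a minimal sketch of the same manual conversion applied layer by layer might look like this (cudnn_model and cpu_model are hypothetical stand-ins for a CuDNN-trained model and an architecture-matched CPU clone; layers the converter does not recognize should pass through unchanged):

from keras.engine.saving import preprocess_weights_for_loading

# Hypothetical models: cudnn_model was trained with CuDNNGRU/CuDNNLSTM layers,
# cpu_model is the same architecture rebuilt with plain GRU/LSTM layers.
for cudnn_layer, cpu_layer in zip(cudnn_model.layers, cpu_model.layers):
    converted = preprocess_weights_for_loading(cpu_layer, cudnn_layer.get_weights())
    cpu_layer.set_weights(converted)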

Read more comments on GitHub >

