Keras backend argmax has None for gradients. Can you even define one?
See original GitHub issue. I am using keras.backend.argmax()
inside a Lambda layer. The model compiles fine but throws an error during fit():
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
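For context on why the error occurs: argmax is piecewise-constant, so its derivative is zero almost everywhere and undefined at ties, which leaves the optimizer with nothing to backpropagate. A quick numpy check (not part of the original issue) makes this concrete:

```python
import numpy as np

# argmax is piecewise-constant: a tiny perturbation of the input almost
# never changes the output index, so the derivative is zero almost
# everywhere and undefined at ties.
x = np.array([0.1, 0.9, 0.3])
eps = 1e-6

print(np.argmax(x))        # 1
print(np.argmax(x + eps))  # still 1: the output did not move at all
```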
My model:
from keras import backend as K
from keras.layers import Dense, Dropout, Embedding, Input, Lambda, LSTM
from keras.models import Model

latent_dim = 512
encoder_inputs = Input(shape=(train_data.shape[1],))
encoder_dense = Dense(vocabulary, activation='softmax')
encoder_outputs = Embedding(vocabulary, latent_dim)(encoder_inputs)
encoder_outputs = LSTM(latent_dim, return_sequences=True)(encoder_outputs)
encoder_outputs = Dropout(0.5)(encoder_outputs)
encoder_outputs = encoder_dense(encoder_outputs)
encoder_outputs = Lambda(K.argmax, arguments={'axis':-1})(encoder_outputs)
encoder_outputs = Lambda(K.cast, arguments={'dtype':'float32'})(encoder_outputs)
encoder_dense1 = Dense(train_label.shape[1], activation='softmax')
decoder_embedding = Embedding(vocabulary, latent_dim)
decoder_lstm1 = LSTM(latent_dim, return_sequences=True)
decoder_lstm2 = LSTM(latent_dim, return_sequences=True)
decoder_dense2 = Dense(vocabulary, activation='softmax')
decoder_outputs = encoder_dense1(encoder_outputs)
decoder_outputs = decoder_embedding(decoder_outputs)
decoder_outputs = decoder_lstm1(decoder_outputs)
decoder_outputs = decoder_lstm2(decoder_outputs)
decoder_outputs = Dropout(0.5)(decoder_outputs)
decoder_outputs = decoder_dense2(decoder_outputs)
model = Model(encoder_inputs, decoder_outputs)
model.summary()
Model summary for easy visualizing:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_7 (InputLayer) (None, 32) 0
_________________________________________________________________
embedding_13 (Embedding) (None, 32, 512) 2018816
_________________________________________________________________
lstm_19 (LSTM) (None, 32, 512) 2099200
_________________________________________________________________
dropout_10 (Dropout) (None, 32, 512) 0
_________________________________________________________________
dense_19 (Dense) (None, 32, 3943) 2022759
_________________________________________________________________
lambda_5 (Lambda) (None, 32) 0
_________________________________________________________________
lambda_6 (Lambda) (None, 32) 0
_________________________________________________________________
dense_20 (Dense) (None, 501) 16533
_________________________________________________________________
embedding_14 (Embedding) (None, 501, 512) 2018816
_________________________________________________________________
lstm_20 (LSTM) (None, 501, 512) 2099200
_________________________________________________________________
lstm_21 (LSTM) (None, 501, 512) 2099200
_________________________________________________________________
dropout_11 (Dropout) (None, 501, 512) 0
_________________________________________________________________
dense_21 (Dense) (None, 501, 3943) 2022759
=================================================================
Total params: 14,397,283
Trainable params: 14,397,283
Non-trainable params: 0
_________________________________________________________________
I googled for a solution, but almost all of the results were about a faulty model. Some recommended avoiding the functions that cause this error. However, as you can see, I cannot build this model without K.argmax (if you know another way, please tell me).
Also, how would you even define the gradient of argmax? I am guessing it is an issue in Keras; if not, please tell me how to define its gradient.
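One common workaround is to replace the hard argmax with a differentiable approximation: a low-temperature softmax concentrates its weights on the largest logit, so a weighted sum of indices approaches the hard argmax while keeping a usable gradient. A minimal numpy sketch of the idea (the function name and temperature value are illustrative, not from the issue):

```python
import numpy as np

def soft_argmax(logits, temperature=0.01):
    """Differentiable approximation of argmax.

    Sharpening the softmax with a low temperature pushes almost all of
    the weight onto the largest logit, so the weighted sum of index
    positions approaches the hard argmax while remaining differentiable.
    """
    # Subtract the max for numerical stability before exponentiating.
    shifted = (logits - logits.max(axis=-1, keepdims=True)) / temperature
    weights = np.exp(shifted)
    weights /= weights.sum(axis=-1, keepdims=True)
    indices = np.arange(logits.shape[-1])
    return (weights * indices).sum(axis=-1)

logits = np.array([0.1, 2.0, 0.5, 1.9])
print(soft_argmax(logits))  # close to 1.0, matching np.argmax(logits)
```

In the model above, this would replace the `Lambda(K.argmax, ...)` layer; the trade-off is that the output is a continuous approximation rather than an exact integer index.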
Issue Analytics
- Created 5 years ago
- Comments: 21 (3 by maintainers)
I have the same problem. Training and evaluation work fine, and the model saves to H5 without issue. However, loading the saved model raises the same error: ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval. Do you have any idea how to fix this? Otherwise the model cannot be used for prediction. Thank you.
I implemented this solution and it worked for me. This is all you will need:
# Save model architecture to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
# Serialize weights to HDF5
model.save_weights("model.h5")
print("Saved model to disk")

# Load model architecture from JSON
json_file = open("model.json", "r")
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
# Load weights into the new model
loaded_model.load_weights("model.h5")
print("Loaded model from disk")
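If you do need a hard argmax in the graph during training, another option (not from this thread, but a standard trick) is the straight-through estimator: use the exact argmax on the forward pass but pretend it was the identity on the backward pass, so a non-zero gradient flows through. In TensorFlow this is typically wired up with tf.custom_gradient; here is a framework-free numpy sketch of the two halves of the trick:

```python
import numpy as np

def st_argmax_forward(logits):
    # Forward pass: a hard one-hot of the argmax, i.e. exactly the
    # non-differentiable result that K.argmax would produce.
    one_hot = np.zeros_like(logits)
    one_hot[np.argmax(logits)] = 1.0
    return one_hot

def st_argmax_backward(grad_output):
    # Backward pass: pretend the forward op was the identity and pass
    # the incoming gradient straight through instead of returning zeros.
    return grad_output

logits = np.array([0.2, 1.5, 0.3])
y = st_argmax_forward(logits)       # [0., 1., 0.]
g = st_argmax_backward(np.ones(3))  # [1., 1., 1.] rather than all zeros
```

The forward/backward mismatch makes this a biased gradient estimate, but in practice it often trains acceptably and avoids the `None` gradient error entirely.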