backend argmax has none for gradients. Can you even define one?

See original GitHub issue

I am using keras.backend.argmax() in a Lambda layer. The model compiles fine but throws an error during fit().

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

My model:

from keras.models import Model
from keras.layers import Input, Dense, Embedding, LSTM, Dropout, Lambda
import keras.backend as K

latent_dim = 512
encoder_inputs = Input(shape=(train_data.shape[1],))
encoder_dense = Dense(vocabulary, activation='softmax')
encoder_outputs = Embedding(vocabulary, latent_dim)(encoder_inputs)
encoder_outputs = LSTM(latent_dim, return_sequences=True)(encoder_outputs)
encoder_outputs = Dropout(0.5)(encoder_outputs)
encoder_outputs = encoder_dense(encoder_outputs)
# These two Lambda layers are where gradients stop flowing:
encoder_outputs = Lambda(K.argmax, arguments={'axis': -1})(encoder_outputs)
encoder_outputs = Lambda(K.cast, arguments={'dtype': 'float32'})(encoder_outputs)

encoder_dense1 = Dense(train_label.shape[1], activation='softmax')
decoder_embedding = Embedding(vocabulary, latent_dim)
decoder_lstm1 = LSTM(latent_dim, return_sequences=True)
decoder_lstm2 = LSTM(latent_dim, return_sequences=True)
decoder_dense2 = Dense(vocabulary, activation='softmax')

decoder_outputs = encoder_dense1(encoder_outputs)
decoder_outputs = decoder_embedding(decoder_outputs)
decoder_outputs = decoder_lstm1(decoder_outputs)
decoder_outputs = decoder_lstm2(decoder_outputs)
decoder_outputs = Dropout(0.5)(decoder_outputs)
decoder_outputs = decoder_dense2(decoder_outputs)
model = Model(encoder_inputs, decoder_outputs)
model.summary()

Model summary for easier visualization:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_7 (InputLayer)         (None, 32)                0         
_________________________________________________________________
embedding_13 (Embedding)     (None, 32, 512)           2018816   
_________________________________________________________________
lstm_19 (LSTM)               (None, 32, 512)           2099200   
_________________________________________________________________
dropout_10 (Dropout)         (None, 32, 512)           0         
_________________________________________________________________
dense_19 (Dense)             (None, 32, 3943)          2022759   
_________________________________________________________________
lambda_5 (Lambda)            (None, 32)                0         
_________________________________________________________________
lambda_6 (Lambda)            (None, 32)                0         
_________________________________________________________________
dense_20 (Dense)             (None, 501)               16533     
_________________________________________________________________
embedding_14 (Embedding)     (None, 501, 512)          2018816   
_________________________________________________________________
lstm_20 (LSTM)               (None, 501, 512)          2099200   
_________________________________________________________________
lstm_21 (LSTM)               (None, 501, 512)          2099200   
_________________________________________________________________
dropout_11 (Dropout)         (None, 501, 512)          0         
_________________________________________________________________
dense_21 (Dense)             (None, 501, 3943)         2022759   
=================================================================
Total params: 14,397,283
Trainable params: 14,397,283
Non-trainable params: 0
_________________________________________________________________

I googled for a solution, but almost all of the results were about a faulty model. Some recommended not using the functions that cause the problem. However, as you can see, I cannot create this model without K.argmax (if you know any other way, do tell me).

Also, how can you even define the gradient of argmax? I am guessing it's an issue in Keras; if not, please tell me how to define its gradient.
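
For context: argmax is piecewise constant, so its true gradient is zero almost everywhere and undefined at ties; there is nothing useful to backpropagate, which is why Keras reports None. A common workaround is a "soft" argmax: the expected index under a sharpened softmax, which approximates argmax while staying differentiable. Below is a minimal sketch, assuming Keras 2.x with the TensorFlow backend; soft_argmax and beta are illustrative names, not part of the Keras API.

import keras.backend as K
from keras.layers import Lambda

def soft_argmax(x, beta=100.0):
    # Sharpen the distribution, then take the expected index.
    # As beta grows this approaches a hard argmax, yet it stays
    # differentiable because it is a weighted average, not a hard pick.
    probs = K.softmax(x * beta)
    idx = K.cast(K.arange(0, K.shape(x)[-1]), 'float32')
    return K.sum(probs * idx, axis=-1)

# Replaces both the K.argmax and the K.cast Lambda layers above:
encoder_outputs = Lambda(soft_argmax)(encoder_outputs)

The output shape matches the original pair of Lambda layers, (batch, timesteps) as float32, so the rest of the model can stay as-is.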

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 21 (3 by maintainers)

Top GitHub Comments

10 reactions
sunwei317 commented, Dec 8, 2018

I have the same problem. There is no problem at all with training and evaluation, and saving the model in H5 works fine. However, when loading the saved model, the error message pops up: ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval. Do you have any idea how to fix this issue? Otherwise, the model cannot be used for prediction. Thank you.
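
A related workaround, assuming the reloaded model is only needed for prediction: load_model accepts compile=False, which skips rebuilding the optimizer and its gradient ops, so the gradient check never runs. A minimal sketch, assuming the model was saved with model.save('model.h5'):

from keras.models import load_model

# compile=False skips re-creating the training ops whose gradient
# check raises the ValueError; predict() still works as usual.
model = load_model('model.h5', compile=False)
predictions = model.predict(test_data)  # test_data is a placeholder name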

2 reactions
yli192 commented, Jan 2, 2020

I monitored the precision, recall, and accuracy while training, and the model was getting better. If the model was saved with model.save, then the error above appears on load_model. However, if the model was saved with model.to_json and model.save_weights, then everything is fine when loading the saved model.

I implemented this solution and it worked for me. This is all you will need:

from keras.models import model_from_json

# Save the model architecture to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)

# Serialize the weights to HDF5
model.save_weights("model.h5")
print("Saved model to disk")

# Load the JSON model
json_file = open("model.json", "r")
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)

# Load the weights into the new model
loaded_model.load_weights("model.h5")
print("Loaded model from disk")
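
This likely works because to_json() stores only the architecture and save_weights() only the parameters, so loading them never recompiles the training graph and the gradient check is skipped. To keep training the reloaded model, compile it again first, e.g. loaded_model.compile(optimizer='adam', loss='categorical_crossentropy'); the optimizer and loss here are placeholders.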

Read more comments on GitHub >

Top Results From Across the Web

python - keras argmax has none for gradients. How to define ...
1 Answer 1 ... For obvious reasons there is no gradient for the Argmax function; How would that even be defined? In order...
Read more >
tf.custom_gradient | TensorFlow v2.11.0
Decorator to define a function with a custom gradient.
Read more >
JAX Frequently Asked Questions (FAQ)
Gradients contain NaN where using where #. If you define a function using where to avoid an undefined value, if you are not...
Read more >
Backend - Keras 2.0.6. Documentation
TensorFlow is an open-source symbolic tensor manipulation framework ... If you have run Keras at least once, you will find the Keras configuration...
Read more >
Gradient for tf.multinomial - Google Groups
ValueError : An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable)....
Read more >
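
The tf.custom_gradient result above suggests another route: keep the hard argmax in the forward pass and substitute a surrogate gradient in the backward pass, a "straight-through" estimator. A minimal sketch in TensorFlow 2, assuming a one-hot output is acceptable; this is an approximation by construction, since argmax has no true gradient.

import tensorflow as tf

@tf.custom_gradient
def straight_through_argmax(logits):
    # Forward pass: hard one-hot argmax (same shape as logits).
    one_hot = tf.one_hot(tf.argmax(logits, axis=-1),
                         depth=tf.shape(logits)[-1],
                         dtype=logits.dtype)
    def grad(dy):
        # Backward pass: pretend the op was the identity, so
        # gradients flow to the logits unchanged.
        return dy
    return one_hot, grad

In a Keras model this could be wrapped in a Lambda layer the same way K.argmax is used above.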
