"Session Crashed for Unknown Reason" in beam_evaluate_sentence() networks_seq2seq_nmt.ipynb
See original GitHub issue
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): colab instance
- TensorFlow version and how it was installed (source or binary): TF 2.4.0 from colab
- TensorFlow-Addons version and how it was installed (source or binary): pip install tensorflow-addons==0.12.0
- Python version: 3.6.9
- Is GPU used? (yes/no): yes
Describe the bug
Got the notebook error "Session crashed for unknown reason" when executing the cell that calls beam_evaluate_sentence(). The code is from docs/tutorials/networks_seq2seq_nmt.ipynb. The notebook worked fine with tfa v0.11.2, but it now crashes with TF 2.4.0 paired with tfa v0.12.0.
Code to reproduce the issue
```python
def beam_evaluate_sentence(sentence, beam_width=3):
    sentence = dataset_creator.preprocess_sentence(sentence)
    inputs = [inp_lang.word_index[i] for i in sentence.split(' ')]
    inputs = tf.keras.preprocessing.sequence.pad_sequences([inputs],
                                                           maxlen=max_length_input,
                                                           padding='post')
    inputs = tf.convert_to_tensor(inputs)
    inference_batch_size = inputs.shape[0]
    result = ''

    enc_start_state = [tf.zeros((inference_batch_size, units)),
                       tf.zeros((inference_batch_size, units))]
    enc_out, enc_h, enc_c = encoder(inputs, enc_start_state)

    dec_h = enc_h
    dec_c = enc_c

    start_tokens = tf.fill([inference_batch_size], targ_lang.word_index['<start>'])
    end_token = targ_lang.word_index['<end>']

    # From the official documentation:
    # NOTE: If you are using the BeamSearchDecoder with a cell wrapped in an
    # AttentionWrapper, then you must ensure that:
    #   - The encoder output has been tiled to beam_width via
    #     tfa.seq2seq.tile_batch (NOT tf.tile).
    #   - The batch_size argument passed to the get_initial_state method of
    #     this wrapper equals true_batch_size * beam_width.
    #   - The initial state created with get_initial_state above contains a
    #     cell_state value holding the properly tiled final state of the encoder.
    enc_out = tfa.seq2seq.tile_batch(enc_out, multiplier=beam_width)
    decoder.attention_mechanism.setup_memory(enc_out)
    print("beam_width * [batch_size, max_length_input, rnn_units] = 3 * [1, 16, 1024]:", enc_out.shape)

    # Set decoder_initial_state, an AttentionWrapperState that accounts for beam_width.
    hidden_state = tfa.seq2seq.tile_batch([enc_h, enc_c], multiplier=beam_width)
    decoder_initial_state = decoder.rnn_cell.get_initial_state(
        batch_size=beam_width * inference_batch_size, dtype=tf.float32)
    decoder_initial_state = decoder_initial_state.clone(cell_state=hidden_state)

    # Instantiate BeamSearchDecoder
    decoder_instance = tfa.seq2seq.BeamSearchDecoder(
        decoder.rnn_cell, beam_width=beam_width, output_layer=decoder.fc)
    decoder_embedding_matrix = decoder.embedding.variables[0]

    # The BeamSearchDecoder object's call() function takes care of everything.
    outputs, final_state, sequence_lengths = decoder_instance(
        decoder_embedding_matrix, start_tokens=start_tokens,
        end_token=end_token, initial_state=decoder_initial_state)

    # outputs is a tfa.seq2seq.FinalBeamSearchDecoderOutput object.
    # The final beam predictions are stored in outputs.predicted_ids.
    # outputs.beam_search_decoder_output is a tfa.seq2seq.BeamSearchDecoderOutput
    # object that keeps track of beam scores and parent ids at each decoding step.
    # final_state is a tfa.seq2seq.BeamSearchDecoderState object.
    # sequence_lengths has shape [inference_batch_size, beam_width] and gives
    # the length of each generated beam.
    # outputs.predicted_ids.shape = (inference_batch_size, time_step_outputs, beam_width)
    # outputs.beam_search_decoder_output.scores.shape = (inference_batch_size, time_step_outputs, beam_width)

    # Convert the shapes of outputs and beam_scores to
    # (inference_batch_size, beam_width, time_step_outputs).
    final_outputs = tf.transpose(outputs.predicted_ids, perm=(0, 2, 1))
    beam_scores = tf.transpose(outputs.beam_search_decoder_output.scores, perm=(0, 2, 1))

    return final_outputs.numpy(), beam_scores.numpy()
```
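The distinction the documentation note draws between tfa.seq2seq.tile_batch and tf.tile is easy to miss. A minimal sketch of the difference, using NumPy arrays as a stand-in for the tensors (the values are hypothetical, chosen only for illustration): tile_batch repeats each batch entry beam_width times consecutively, so all beams of one example stay adjacent, whereas tf.tile concatenates whole copies of the batch, which misaligns beams with their source example.

```python
import numpy as np

# Encoder output for a batch of 2 examples, feature dim 3 (hypothetical values).
enc_out = np.array([[1, 1, 1],
                    [2, 2, 2]])
beam_width = 3

# tfa.seq2seq.tile_batch repeats each batch entry beam_width times
# consecutively; np.repeat along axis 0 mimics that layout:
tiled = np.repeat(enc_out, beam_width, axis=0)

# tf.tile would instead concatenate whole copies of the batch, so rows
# alternate between examples and beams no longer group by example:
wrong = np.tile(enc_out, (beam_width, 1))

print(tiled[:3])   # three copies of example 0 -- correct beam grouping
print(wrong[:3])   # example 0, example 1, example 0 -- wrong grouping
```

Both layouts have shape (batch_size * beam_width, features), which is why passing the wrongly tiled tensor can fail only later inside the attention computation rather than at the call site.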
```python
def beam_translate(sentence):
    result, beam_scores = beam_evaluate_sentence(sentence)
    print(result.shape, beam_scores.shape)
    for beam, score in zip(result, beam_scores):
        print(beam.shape, score.shape)
        output = targ_lang.sequences_to_texts(beam)
        output = [a[:a.index('<end>')] for a in output]
        beam_score = [a.sum() for a in score]
        print('Input: %s' % (sentence))
        for i in range(len(output)):
            print('{} Predicted translation: {} {}'.format(i + 1, output[i], beam_score[i]))

beam_translate(u'hace mucho frio aqui.')
```
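The post-processing inside beam_translate can be exercised in isolation, which helps separate a crash in the decoder call from a bug in the output handling. A sketch with plain Python and hypothetical beam texts standing in for the output of targ_lang.sequences_to_texts (the strings and scores below are invented for illustration):

```python
# Hypothetical decoded beams, as text (stand-ins for sequences_to_texts output).
beams = ['it is very cold here . <end> <pad>',
         'it is really cold here . <end>',
         'it is very cold here <end> <pad> <pad>']
# Hypothetical per-step log-probabilities, one list per beam.
scores = [[-0.1, -0.2, -0.3], [-0.2, -0.2], [-0.1, -0.5]]

# Cut each beam at the <end> token, mirroring a[:a.index('<end>')] above.
trimmed = [b[:b.index('<end>')].strip() for b in beams]
# Sum the per-step scores to get one score per beam, mirroring a.sum().
beam_score = [sum(s) for s in scores]

for i, (out, sc) in enumerate(zip(trimmed, beam_score), start=1):
    print('{} Predicted translation: {} {}'.format(i, out, sc))
```

Note that str.index raises ValueError when a beam never emitted '<end>', so the list comprehension in beam_translate assumes every beam terminated within the decoded length.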
Issue Analytics
- State:
- Created 3 years ago
- Comments:7 (6 by maintainers)
I can confirm it’s now okay with tf 2.4.1 and tfa 0.12.1. Feel free to reopen if the problem still exists.
Yes, they probably have a custom build for Colab with CUDA 10.
The CUDA 11 ticket was at https://github.com/googlecolab/colabtools/issues/1574