BERT fails on TPU when jit_compile=True
See this gist for the repro: https://colab.sandbox.google.com/gist/mattdangerw/7b8ba7f942ea9e8e912b4e21cdf9bbf9/tpu-bug-jit_compile-true.ipynb
TPU training works fine when jit_compile=False. When jit_compile=True, we hit a weird bug:
InvalidArgumentError: 9 root error(s) found.
(0) INVALID_ARGUMENT: {{function_node __inference_train_function_16010}} Reshape's input dynamic dimension is decomposed into multiple output dynamic dimensions, but the constraint is ambiguous and XLA can't infer the output dimension %reshape.21 = f32[16,128,512]{2,1,0} reshape(f32[2048,512]{1,0} %transpose.6), metadata={op_type="Reshape" op_name="bert_classifier/backbone/transformer_layer_0/dense/Tensordot"}.
[[{{node TPUReplicate/_compile/_8571286824872389582/_6}}]]
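The shapes in the error message hint at why the reshape is hard to resolve. The op name (dense/Tensordot) suggests the dense layer flattens its [16, 128, 512] input to [2048, 512] for the matmul and then reshapes back. The sketch below covers only that shape arithmetic, in plain Python. It is an illustration and does not reproduce XLA's actual inference logic. The helper names `flatten_leading` and `candidate_splits` are hypothetical and exist only for this example.

```python
from math import prod


def flatten_leading(shape):
    """Collapse all leading axes into one, keeping the last (feature) axis.

    This mirrors the [batch, seq, features] -> [batch * seq, features]
    flattening that the shapes in the error message suggest.
    """
    *leading, features = shape
    return (prod(leading), features)


def candidate_splits(flat_rows):
    """List every (batch, seq) pair whose product equals flat_rows."""
    return [
        (batch, flat_rows // batch)
        for batch in range(1, flat_rows + 1)
        if flat_rows % batch == 0
    ]


# Shapes taken from the error message above.
static_shape = (16, 128, 512)
flat_shape = flatten_leading(static_shape)   # (2048, 512)
splits = candidate_splits(flat_shape[0])

print("flattened shape:", flat_shape)
print("possible (batch, seq) splits of 2048 rows:", splits)
```

With fully static shapes the target (16, 128) is known, so the reshape back is unambiguous. If the flattened row dimension is treated as dynamic, the row count alone does not say which of the two output dimensions should carry that dynamic size, which matches the "constraint is ambiguous" wording in the error.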
Issue Analytics
- State:
- Created a year ago
- Comments:5 (2 by maintainers)
Top GitHub Comments
Yes, we should be good!
@chenmoneygithub thanks! So basically we can always pass jit_compile=True here, and the option will be ignored by core Keras?