
BERT fails on TPU when jit_compile=True

See original GitHub issue

See this gist for the repro: https://colab.sandbox.google.com/gist/mattdangerw/7b8ba7f942ea9e8e912b4e21cdf9bbf9/tpu-bug-jit_compile-true.ipynb

TPU training works fine when jit_compile=False. With jit_compile=True we hit the following error:

InvalidArgumentError: 9 root error(s) found.
  (0) INVALID_ARGUMENT: {{function_node __inference_train_function_16010}} Reshape's input dynamic dimension is decomposed into multiple output dynamic dimensions, but the constraint is ambiguous and XLA can't infer the output dimension %reshape.21 = f32[16,128,512]{2,1,0} reshape(f32[2048,512]{1,0} %transpose.6), metadata={op_type="Reshape" op_name="bert_classifier/backbone/transformer_layer_0/dense/Tensordot"}. 
	 [[{{node TPUReplicate/_compile/_8571286824872389582/_6}}]]
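For context, what jit_compile=True does is ask Keras to compile the training step with XLA; the reshape error above then surfaces at TPU compile time. This is a minimal CPU-runnable sketch of that option, not the actual repro (that lives in the gist above) — the tiny Dense model here is a stand-in for the BERT classifier, which is where the ambiguous dynamic-dimension reshape actually occurs:

```python
import numpy as np
import tensorflow as tf

# Stand-in model: the real issue uses a KerasNLP BertClassifier on a TPU.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

# jit_compile=True wraps the train step in tf.function(jit_compile=True),
# i.e. explicit XLA compilation. On CPU/GPU this typically works; on TPU
# it is where the "ambiguous reshape" error in this issue is raised.
model.compile(optimizer="adam", loss="mse", jit_compile=True)

x = np.random.rand(32, 4).astype("float32")
y = np.random.rand(32, 1).astype("float32")
history = model.fit(x, y, epochs=1, verbose=0)
```

With jit_compile=False (the default on TPU), the same fit call goes through the regular TPU compilation path and trains without error.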

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
chenmoneygithub commented, Dec 5, 2022

Yes, we should be good!

0 reactions
mattdangerw commented, Dec 5, 2022

@chenmoneygithub thanks! So basically we can always pass jit_compile=True here, and the option will be ignored by core Keras?


Top Results From Across the Web

BERT-joint TF1 baseline fails on TPU training #34155 - GitHub
This may be due to a preemption in a connected worker or parameter server. The current session will be closed and a new...
Error loading pretrained BERT on TPU using Keras
I am working with TensorFlow and Keras on a TPU and I need to load the pretrained BERT model and convert it to...
Solve GLUE tasks using BERT on TPU | Text - TensorFlow
BERT can be used to solve many problems in natural language processing. You will learn how to fine-tune BERT for many tasks from...
BERT Fine Tuning with Cloud TPU
This tutorial shows you how to train the Bidirectional Encoder Representations from Transformers (BERT) model on Cloud TPU. BERT is a method of...
Choosing the right parameters for pre-training BERT using TPU
Pre-training a BERT model is not easy and many articles out there give a ... inadvertently increase rather than decrease the training error....
