T5 training with Keras: InvalidArgumentError: logits and labels must have the same first dimension
Environment info
- transformers version: 4.3.0.dev0
- Platform: Linux version 4.19.0-14-cloud-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.171-2 (2021-01-30)
- Python version: 3.7
- PyTorch version (GPU?): 1.7.1, No
- Tensorflow version (GPU?): 2.3.1, No
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help
@patrickvonplaten, @patil-suraj
Information
Model I am using (Bert, XLNet …): T5
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The tasks I am working on is:
- an official GLUE/SQUaD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
See code below:
import numpy as np
import tensorflow as tf
from transformers import T5TokenizerFast, TFT5ForConditionalGeneration
MODEL_NAME = "t5-small"
INPUT_TEXTS = [
    "When Liana Barrientos was 23 years old, she got married in Westchester County, New York.",
    "Only 18 days after that marriage, she got hitched yet again.",
    "Then, Barrientos declared 'I do' five more times, sometimes only within two weeks of each other.",
    "In 2010, she married once more, this time in the Bronx.",
    "In an application for a marriage license, she stated it was her 'first and only' marriage.",
    "Prosecutors said the marriages were part of an immigration scam.",
    "In total, Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002.",
    "All occurred either in Westchester County, Long Island, New Jersey or the Bronx.",
    "Any divorces happened only after such filings were approved.",
    "It was unclear whether any of the men will be prosecuted.",
]
LABEL_TEXTS = ["Yes", "No", "Yes", "Yes", "No", "Yes", "No", "No", "Well, you never know, right?", "Yes"]
tokenizer = T5TokenizerFast.from_pretrained(MODEL_NAME)
tokenized_inputs = tokenizer(INPUT_TEXTS, padding="max_length", truncation=True, return_tensors="tf")
tokenized_labels = tokenizer(LABEL_TEXTS, padding="max_length", truncation=True, return_tensors="tf")
decoder_input_texts = ["<pad> " + _txt for _txt in LABEL_TEXTS]
tokenized_decoder_inputs = tokenizer(decoder_input_texts, padding="max_length", truncation=True, return_tensors="tf")
def add_dec_inp_ids(_features, _labels, _dec_inp_ids):
    _features["decoder_input_ids"] = _dec_inp_ids
    return (_features, _labels)
ds = tf.data.Dataset.from_tensor_slices(
    (tokenized_inputs.data, tokenized_decoder_inputs.input_ids, tokenized_labels.input_ids))\
    .map(add_dec_inp_ids)
batch_size = 2
steps_per_epoch = np.ceil(len(INPUT_TEXTS) / batch_size)
train_ds = ds.repeat().prefetch(tf.data.experimental.AUTOTUNE).batch(batch_size)
model = TFT5ForConditionalGeneration.from_pretrained(MODEL_NAME)
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5)
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=optimizer, loss=loss)
model.fit(train_ds, epochs=2, steps_per_epoch=steps_per_epoch)
And what I get:
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [8192,64] and labels shape [1024]
[[node sparse_categorical_crossentropy_4/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-2-0152c3165ef3>:51) ]] [Op:__inference_train_function_27365]
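The shapes in this message can be checked with a little arithmetic. The sketch below is plain Python and rests on assumptions that the traceback does not confirm: that padding="max_length" pads every sequence to 512 tokens, and that t5-small uses 8 attention heads, a key/value size of 64, and a vocabulary of 32128. The helper flatten_for_sparse_ce is a hypothetical name used only for illustration; it mimics how the sparse cross-entropy op collapses every axis except the last.

```python
# Values taken from the script above or assumed from the t5-small config.
batch_size = 2       # from the script
seq_len = 512        # assumption: padding="max_length" pads to 512 tokens
vocab_size = 32128   # assumption: t5-small vocabulary size
num_heads = 8        # assumption: t5-small attention heads
d_kv = 64            # assumption: t5-small key/value size per head


def product(values):
    """Multiply all entries of a list of ints."""
    result = 1
    for v in values:
        result *= v
    return result


def flatten_for_sparse_ce(shape):
    """Hypothetical helper: collapse all axes but the last, as the loss op does."""
    return [product(shape[:-1]), shape[-1]]


# Labels of shape [batch, seq_len] are flattened to one dimension.
labels_flat = [product([batch_size, seq_len])]

# What the loss would see if it were given the language-model logits.
lm_logits_flat = flatten_for_sparse_ce([batch_size, seq_len, vocab_size])

# What the loss would see if it were given a per-head key/value tensor.
kv_flat = flatten_for_sparse_ce([batch_size, num_heads, seq_len, d_kv])

print(labels_flat, lm_logits_flat, kv_flat)
```

Under these assumptions the labels flatten to [1024], matching the error, while real logits would flatten to [1024, 32128]. The reported [8192, 64] instead matches a tensor of shape [2, 8, 512, 64], so one plausible reading is that Keras applied the loss to a model output other than the logits. This is an inference from the shapes, not something stated in the thread.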
Expected behavior
Training starts and then finishes without error.
Issue Analytics
- State:
- Created 3 years ago
- Comments:12 (8 by maintainers)
Top GitHub Comments
Hello!
For now, T5 cannot be trained with the usual .compile() and .fit() methods (the same is true of several other models, but we are currently working on this). You have to either use the TFTrainer or customize the behavior of Keras's internal training loop yourself. An example of how to handle T5 and train it properly is shown in this nice Colab.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
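As a rough illustration of what a custom training step has to compute for T5, the sketch below shows two pieces in plain Python: building decoder inputs by shifting the labels right, and averaging the cross-entropy over non-padded label positions only. It is not the TFTrainer code or the Colab's method. The function names shift_right and masked_token_loss are hypothetical, and the sketch assumes that T5 uses token id 0 both as padding and as the decoder start token and that ignored label positions are marked with -100.

```python
import math

PAD_ID = 0         # assumption: T5 pad id, also used as the decoder start token
IGNORE_ID = -100   # assumption: label value marking positions to skip in the loss


def shift_right(label_ids, start_id=PAD_ID):
    """Build decoder inputs: prepend the start token and drop the last label."""
    shifted = [start_id] + list(label_ids[:-1])
    # Ignored positions must become a real token id before being fed to the decoder.
    return [PAD_ID if t == IGNORE_ID else t for t in shifted]


def masked_token_loss(logits, label_ids):
    """Mean cross-entropy over positions whose label is not IGNORE_ID.

    logits: one list of vocabulary scores per sequence position.
    """
    total, count = 0.0, 0
    for scores, label in zip(logits, label_ids):
        if label == IGNORE_ID:
            continue
        # Numerically stable log-sum-exp over the vocabulary scores.
        peak = max(scores)
        log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_z - scores[label]
        count += 1
    return total / max(count, 1)


# Toy example: four positions, vocabulary of four tokens, two padded labels.
labels = [3, 1, IGNORE_ID, IGNORE_ID]
decoder_inputs = shift_right(labels)
uniform_logits = [[0.0, 0.0, 0.0, 0.0] for _ in labels]
loss = masked_token_loss(uniform_logits, labels)
```

The point of the sketch is that the loss is taken per token against logits of shape [sequence length, vocabulary size], with padded label positions excluded, which is the bookkeeping a hand-written training loop would need to reproduce.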