T5 training with Keras: InvalidArgumentError: logits and labels must have the same first dimension

Environment info

transformers version: 4.3.0.dev0
Platform: Linux version 4.19.0-14-cloud-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.171-2 (2021-01-30)
Python version: 3.7
PyTorch version (GPU?): 1.7.1, No
Tensorflow version (GPU?): 2.3.1, No
Using GPU in script?: No
Using distributed or parallel set-up in script?: No

Who can help

@patrickvonplaten, @patil-suraj

Information

Model I am using (Bert, XLNet …): T5

The problem arises when using:

the official example scripts: (give details below)
my own modified scripts: (give details below)

The task I am working on is:

an official GLUE/SQUaD task: (give the name)
my own task or dataset: (give details below)

To reproduce

See code below:

import numpy as np
import tensorflow as tf
from transformers import T5TokenizerFast, TFT5ForConditionalGeneration

MODEL_NAME = "t5-small"
INPUT_TEXTS = [
    "When Liana Barrientos was 23 years old, she got married in Westchester County, New York.",
    "Only 18 days after that marriage, she got hitched yet again.",
    "Then, Barrientos declared 'I do' five more times, sometimes only within two weeks of each other.",
    "In 2010, she married once more, this time in the Bronx.",
    "In an application for a marriage license, she stated it was her 'first and only' marriage.",
    "Prosecutors said the marriages were part of an immigration scam.",
    "In total, Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002.",
    "All occurred either in Westchester County, Long Island, New Jersey or the Bronx.",
    "Any divorces happened only after such filings were approved.",
    "It was unclear whether any of the men will be prosecuted.",
]
LABEL_TEXTS = ["Yes", "No", "Yes", "Yes", "No", "Yes", "No", "No", "Well, you never know, right?", "Yes"]

tokenizer = T5TokenizerFast.from_pretrained(MODEL_NAME)

tokenized_inputs = tokenizer(INPUT_TEXTS, padding="max_length", truncation=True, return_tensors="tf")
tokenized_labels = tokenizer(LABEL_TEXTS, padding="max_length", truncation=True, return_tensors="tf")

decoder_input_texts = ["<pad> " + _txt for _txt in LABEL_TEXTS]
tokenized_decoder_inputs = tokenizer(decoder_input_texts, padding="max_length", truncation=True, return_tensors="tf")


def add_dec_inp_ids(_features, _labels, _dec_inp_ids):
    _features["decoder_input_ids"] = _dec_inp_ids
    return (_features, _labels)


ds = tf.data.Dataset.from_tensor_slices(
    (tokenized_inputs.data, tokenized_decoder_inputs.input_ids, tokenized_labels.input_ids))\
    .map(add_dec_inp_ids)

batch_size = 2
steps_per_epoch = np.ceil(len(INPUT_TEXTS) / batch_size)

train_ds = ds.repeat().batch(batch_size).prefetch(tf.data.experimental.AUTOTUNE)

model = TFT5ForConditionalGeneration.from_pretrained(MODEL_NAME)
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5)
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

model.compile(optimizer=optimizer, loss=loss)
model.fit(train_ds, epochs=2, steps_per_epoch=steps_per_epoch)

And what I get:

tensorflow.python.framework.errors_impl.InvalidArgumentError:  logits and labels must have the same first dimension, got logits shape [8192,64] and labels shape [1024]
	 [[node sparse_categorical_crossentropy_4/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-2-0152c3165ef3>:51) ]] [Op:__inference_train_function_27365]
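
One plausible reading of the two shapes in the error, offered as an assumption rather than something confirmed in this thread: Keras applies the compiled loss to every output the model returns, not only to the logits, so the labels end up paired with a cached key/value tensor. The sketch below only redoes the arithmetic in plain Python. The head count (8), per-head size (64), and vocabulary size (32128) are taken from the public t5-small configuration, and the 512-token length is what padding="max_length" yields for this tokenizer. The helper name flatten_all_but_last is made up for illustration.

```python
# Shape arithmetic only; no TensorFlow needed.
# Assumed values: batch size and padding length from the report,
# head count, head size and vocabulary size from the t5-small config.
batch_size = 2
seq_len = 512
num_heads = 8
d_kv = 64
vocab_size = 32128


def flatten_all_but_last(shape):
    """Collapse every axis except the last, as a sparse cross-entropy op does."""
    rows = 1
    for dim in shape[:-1]:
        rows *= dim
    return (rows, shape[-1])


# Labels: one class id per token position.
label_rows = batch_size * seq_len

# What the loss would see if it were applied to the LM logits.
expected_logits = flatten_all_but_last((batch_size, seq_len, vocab_size))

# What it would see if it were applied to a cached key/value tensor instead.
cache_as_logits = flatten_all_but_last((batch_size, num_heads, seq_len, d_kv))

print("labels rows:      ", label_rows)
print("expected logits:  ", expected_logits)
print("cache as 'logits':", cache_as_logits)
```

Under these assumptions the last line reproduces the [8192,64] from the traceback, while the label count matches the [1024], which is consistent with the loss being computed against an output other than the logits.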

Expected behavior

Training starts and then finishes without error.

Issue Analytics

Created 3 years ago
Comments:12 (8 by maintainers)

Top GitHub Comments

2 reactions

jplu commented, Feb 14, 2021

Hello!

For now, T5 cannot be trained with the usual .compile() and .fit() methods (the same is true of several other models, and we are currently working on this). You have to either use the TFTrainer or override the behavior of Keras's internal training loop yourself. An example of how to handle T5 and train it properly is shown in this nice Colab.
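
A minimal sketch of the second option mentioned above, a hand-written training step used in place of .compile() and .fit(). It assumes that the model accepts a features dict plus a labels argument and returns an output with a loss field that may be unreduced (one value per token). train_step is a hypothetical helper name, and this is not the code from the linked Colab.

```python
import tensorflow as tf


def train_step(model, optimizer, features, labels):
    """Run one optimization step and return the scalar loss.

    Assumption: model(features, labels=labels, training=True) returns an
    object whose .loss holds the language-modeling loss, as the TF T5 model
    in transformers is expected to do when labels are passed.
    """
    with tf.GradientTape() as tape:
        outputs = model(features, labels=labels, training=True)
        # The loss may come back per token, so reduce it to a scalar.
        loss = tf.reduce_mean(outputs.loss)
    variables = model.trainable_variables
    grads = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(grads, variables))
    return loss
```

With the dataset from the reproduction, each element of train_ds is a (features, labels) pair, so a plain Python loop over train_ds.take(steps) could call train_step once per batch. Because the loss comes from the model itself, only the logits are compared with the labels.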

0 reactions

github-actions[bot] commented, Apr 14, 2021

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.