T5 training with Keras: InvalidArgumentError: logits and labels must have the same first dimension

See original GitHub issue

Environment info

  • transformers version: 4.3.0.dev0
  • Platform: Linux version 4.19.0-14-cloud-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.171-2 (2021-01-30)
  • Python version: 3.7
  • PyTorch version (GPU?): 1.7.1, No
  • Tensorflow version (GPU?): 2.3.1, No
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help

@patrickvonplaten, @patil-suraj

Information

Model I am using (Bert, XLNet …): T5

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

See code below:

import numpy as np
import tensorflow as tf
from transformers import T5TokenizerFast, TFT5ForConditionalGeneration

MODEL_NAME = "t5-small"
INPUT_TEXTS = [
    "When Liana Barrientos was 23 years old, she got married in Westchester County, New York.",
    "Only 18 days after that marriage, she got hitched yet again.",
    "Then, Barrientos declared 'I do' five more times, sometimes only within two weeks of each other.",
    "In 2010, she married once more, this time in the Bronx.",
    "In an application for a marriage license, she stated it was her 'first and only' marriage.",
    "Prosecutors said the marriages were part of an immigration scam.",
    "In total, Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002.",
    "All occurred either in Westchester County, Long Island, New Jersey or the Bronx.",
    "Any divorces happened only after such filings were approved.",
    "It was unclear whether any of the men will be prosecuted.",
]
LABEL_TEXTS = ["Yes", "No", "Yes", "Yes", "No", "Yes", "No", "No", "Well, you never know, right?", "Yes"]

tokenizer = T5TokenizerFast.from_pretrained(MODEL_NAME)

tokenized_inputs = tokenizer(INPUT_TEXTS, padding="max_length", truncation=True, return_tensors="tf")
tokenized_labels = tokenizer(LABEL_TEXTS, padding="max_length", truncation=True, return_tensors="tf")

decoder_input_texts = ["<pad> " + _txt for _txt in LABEL_TEXTS]
tokenized_decoder_inputs = tokenizer(decoder_input_texts, padding="max_length", truncation=True, return_tensors="tf")


def add_dec_inp_ids(_features, _labels, _dec_inp_ids):
    _features["decoder_input_ids"] = _dec_inp_ids
    return (_features, _labels)


ds = tf.data.Dataset.from_tensor_slices(
    (tokenized_inputs.data, tokenized_decoder_inputs.input_ids, tokenized_labels.input_ids))\
    .map(add_dec_inp_ids)

batch_size = 2
steps_per_epoch = np.ceil(len(INPUT_TEXTS) / batch_size)

train_ds = ds.repeat().prefetch(tf.data.experimental.AUTOTUNE).batch(batch_size)

model = TFT5ForConditionalGeneration.from_pretrained(MODEL_NAME)
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5)
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

model.compile(optimizer=optimizer, loss=loss)
model.fit(train_ds, epochs=2, steps_per_epoch=steps_per_epoch)

And what I get:

tensorflow.python.framework.errors_impl.InvalidArgumentError:  logits and labels must have the same first dimension, got logits shape [8192,64] and labels shape [1024]
	 [[node sparse_categorical_crossentropy_4/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-2-0152c3165ef3>:51) ]] [Op:__inference_train_function_27365]
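
The two shapes in the error can be checked by hand. The sketch below is one possible reading, not something confirmed in the thread: it assumes the tokenizer pads to a maximum length of 512 and that t5-small uses 8 attention heads with a per-head key/value size of 64 and a vocabulary of 32128 (all taken from the public t5-small configuration, not from the traceback). Under those assumptions the labels flatten to 2 * 512 = 1024 rows as reported, but the 8192 x 64 "logits" match a cached key/value tensor rather than the language-model logits, which would suggest the Keras loss was applied to one of the model's extra outputs.

```python
# Hypothetical shape check. Every constant except BATCH_SIZE is an assumption
# about t5-small and its tokenizer, not a value read from the traceback.
BATCH_SIZE = 2      # batch_size in the script above
MAX_LENGTH = 512    # assumed: padding="max_length" pads to 512 tokens
NUM_HEADS = 8       # assumed: t5-small num_heads
D_KV = 64           # assumed: t5-small per-head key/value size
VOCAB_SIZE = 32128  # assumed: t5-small vocab_size


def flattened_shape(shape):
    """Collapse all leading dimensions into one, keeping the last dimension."""
    rows = 1
    for dim in shape[:-1]:
        rows *= dim
    return [rows, shape[-1]]


# Labels of shape (batch, seq) flatten to a single dimension.
labels_rows = BATCH_SIZE * MAX_LENGTH

# What the loss should have received: language-model logits.
lm_logits = flattened_shape((BATCH_SIZE, MAX_LENGTH, VOCAB_SIZE))

# What the reported shape is consistent with: a cached key/value tensor.
cached_kv = flattened_shape((BATCH_SIZE, NUM_HEADS, MAX_LENGTH, D_KV))

print(labels_rows, lm_logits, cached_kv)
```

If this reading is right, the first dimension of the real logits (1024) would have matched the labels, and the mismatch comes only from which model output the loss was paired with.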

Expected behavior

Training starts and then finishes without error.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

2 reactions
jplu commented, Feb 14, 2021

Hello!

For now, T5 cannot be trained with the usual .compile() and .fit() methods (the same is true of several other models, but we are currently working on this). You have to either use the TFTrainer or customize the behavior of Keras's internal training loop yourself. An example of how to handle T5 and train it properly is shown in this nice Colab.
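
Whichever of the two routes is taken, the targets have to be prepared consistently before they reach the loss. The following is a minimal sketch in plain Python, not code from the Colab mentioned above. It assumes that T5 uses token id 0 both as padding and as the decoder start token, and that -100 is the label value skipped by the loss helpers in transformers; a plain Keras loss would need an explicit mask instead. The function names and token ids are hypothetical.

```python
PAD_ID = 0           # assumed: T5 pad token id, also the decoder start token
IGNORE_INDEX = -100  # assumed: label value ignored by the transformers loss


def shift_right(label_ids, start_id=PAD_ID):
    """Build decoder inputs: prepend the start token and drop the last label."""
    return [start_id] + list(label_ids[:-1])


def mask_padding(label_ids, pad_id=PAD_ID, ignore_index=IGNORE_INDEX):
    """Replace padding positions so they do not contribute to the loss."""
    return [ignore_index if tok == pad_id else tok for tok in label_ids]


# Hypothetical token ids for a short target padded to length 6.
labels = [2163, 1, 0, 0, 0, 0]

decoder_input_ids = shift_right(labels)
loss_labels = mask_padding(labels)

print(decoder_input_ids)  # [0, 2163, 1, 0, 0, 0]
print(loss_labels)        # [2163, 1, -100, -100, -100, -100]
```

The decoder sees each target token one step late, so at every position it predicts the next token, and padded positions are excluded from the loss.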

0 reactions
github-actions[bot] commented, Apr 14, 2021

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
