
ValueError: You have to specify either input_ids or inputs_embeds!

See original GitHub issue

Details

I’m quite new to NLP tasks. I was trying to train the T5-large model with the setup below, but unfortunately I got an error.

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def build_model(transformer, max_len=512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]
    cls_token = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(cls_token)
    model = Model(inputs=input_word_ids, outputs=out)
    return model

model = build_model(transformer_layer, max_len=MAX_LEN)

It throws:

ValueError: in converted code:
ValueError                                Traceback (most recent call last)
<ipython-input-19-8ad6e68cd3f5> in <module>
----> 5     model = build_model(transformer_layer, max_len=MAX_LEN)
      6 
      7 model.summary()

<ipython-input-17-e001ed832ed6> in build_model(transformer, max_len)
     31     """
     32     input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
---> 33     sequence_output = transformer(input_word_ids)[0]
     34     cls_token = sequence_output[:, 0, :]
     35     out = Dense(1, activation='sigmoid')(cls_token)
ValueError: You have to specify either input_ids or inputs_embeds
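For context, the error comes from an argument check at the start of the model’s forward pass: if neither `input_ids` nor `inputs_embeds` reaches the T5 stack, it raises. A minimal pure-Python sketch of that kind of guard (a hypothetical simplification, not the actual transformers source):

```python
def t5_stack_call(input_ids=None, inputs_embeds=None):
    """Simplified sketch of the guard raising this ValueError."""
    if input_ids is None and inputs_embeds is None:
        raise ValueError("You have to specify either input_ids or inputs_embeds")
    # ... embedding lookup and encoder layers would follow here ...
    return input_ids if input_ids is not None else inputs_embeds

# If the tensor arrives under an unexpected argument name, both stay None
# and the guard fires even though the caller did pass data:
try:
    t5_stack_call()
except ValueError as e:
    print(e)
```

This is why the error can appear even when you believe you supplied inputs: the tensor was simply not routed to the `input_ids` argument.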

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 21 (17 by maintainers)

Top GitHub Comments

patrickvonplaten commented on May 16, 2020 (2 reactions)

@ratthachat - thanks for your message! We definitely need to provide more TF examples for the T5 model. I want to tackle this problem in ~2 weeks.

In TF we use the naming convention inputs, so you should change the call to model.fit({"inputs": x_encoder}). I very much agree that the error message is quite misleading, and I am correcting it in this PR: #4401.
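In other words, a dict passed to the model is unpacked by key, so the key has to match the name the model expects. A hypothetical sketch of that dispatch (the names here are illustrative, not the real transformers internals):

```python
def tf_t5_call(inputs):
    """Sketch: a model that unpacks a dict under the expected key 'inputs'."""
    if isinstance(inputs, dict):
        input_ids = inputs.get("inputs")   # only this key is recognized
    else:
        input_ids = inputs                 # plain tensor passed positionally
    if input_ids is None:
        # The data arrived under an unrecognized key, so the model
        # behaves as if no input was given at all.
        raise ValueError("You have to specify either input_ids or inputs_embeds")
    return input_ids

x_encoder = [[101, 2023, 102]]             # toy token ids
tf_t5_call({"inputs": x_encoder})          # recognized key: works
try:
    tf_t5_call({"input_word_ids": x_encoder})  # wrong key: misleading error
except ValueError as e:
    print(e)
```

This explains why the original `build_model` failed: the data was supplied, but under a name the model did not recognize.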

patrickvonplaten commented on Apr 5, 2020 (2 reactions)

I’m not 100% sure what you want to do here exactly. T5 is always trained in a text-to-text format. We have a section here on how to train T5: https://huggingface.co/transformers/model_doc/t5.html#training

Otherwise I’d recommend taking a look at the official paper.


Top Results From Across the Web

  • I get a "You have to specify either input_ids or inputs_embeds ...
    My input sequence is unconstrained (any sentence), and my output sequence is formal language that resembles assembly.
  • "You have to specify either input_ids or inputs_embeds", but I ...
    The problem is that there's probably a renaming procedure in the code; since we use an encoder-decoder architecture we have 2 types of...
  • modeling_tf_t5.py - CodaLab Worksheets
    ... raise ValueError("You have to specify either input_ids or inputs_embeds") if inputs_embeds is None: assert self.embed_tokens is not None, "You have to ...
  • bert-base-chinese-for-tnews - Kaggle
    ... raise ValueError("You have to specify either input_ids or inputs_embeds") device = input_ids.device if input_ids is not None else inputs_embeds.device ...
  • BERT Inner Workings - TOPBOTS
    I believe it's easy to follow along if you have the code next to the ... have to specify either input_ids or inputs_embeds")...
