
ValueError: You have to specify either input_ids or inputs_embeds!

See original GitHub issue

Details

I’m quite new to NLP tasks. I was trying to train the T5-large model with the setup below, but unfortunately I got an error.

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def build_model(transformer, max_len=512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]
    cls_token = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(cls_token)
    model = Model(inputs=input_word_ids, outputs=out)
    return model

model = build_model(transformer_layer, max_len=MAX_LEN)

It throws:

ValueError: in converted code:
ValueError                                Traceback (most recent call last)
<ipython-input-19-8ad6e68cd3f5> in <module>
----> 5     model = build_model(transformer_layer, max_len=MAX_LEN)
      6 
      7 model.summary()

<ipython-input-17-e001ed832ed6> in build_model(transformer, max_len)
     31     """
     32     input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
---> 33     sequence_output = transformer(input_word_ids)[0]
     34     cls_token = sequence_output[:, 0, :]
     35     out = Dense(1, activation='sigmoid')(cls_token)
ValueError: You have to specify either input_ids or inputs_embeds
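For context, the error comes from an argument check at the start of the model’s forward pass: if neither `input_ids` nor `inputs_embeds` reaches the T5 stack, it raises. A minimal pure-Python sketch of that kind of guard (a hypothetical simplification, not the actual transformers source):

```python
def t5_stack_call(input_ids=None, inputs_embeds=None):
    """Simplified sketch of the guard raising this ValueError."""
    if input_ids is None and inputs_embeds is None:
        raise ValueError("You have to specify either input_ids or inputs_embeds")
    # ... embedding lookup and encoder layers would follow here ...
    return input_ids if input_ids is not None else inputs_embeds

# If the tensor arrives under an unexpected argument name, both stay None
# and the guard fires even though the caller did pass data:
try:
    t5_stack_call()
except ValueError as e:
    print(e)
```

This is why the error can appear even when you believe you supplied inputs: the tensor was simply not routed to the `input_ids` argument.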

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 21 (17 by maintainers)

Top GitHub Comments

patrickvonplaten commented on May 16, 2020 (2 reactions)

@ratthachat - thanks for your message! We definitely need to provide more TF examples for the T5 model. I want to tackle this problem in ~2 weeks.

In TF we use the naming convention inputs, so you should change the call to model.fit({"inputs": x_encoder}). I very much agree that the error message is quite misleading, and I am correcting it in this PR: #4401.
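In other words, a dict passed to the model is unpacked by key, so the key has to match the name the model expects. A hypothetical sketch of that dispatch (the names here are illustrative, not the real transformers internals):

```python
def tf_t5_call(inputs):
    """Sketch: a model that unpacks a dict under the expected key 'inputs'."""
    if isinstance(inputs, dict):
        input_ids = inputs.get("inputs")   # only this key is recognized
    else:
        input_ids = inputs                 # plain tensor passed positionally
    if input_ids is None:
        # The data arrived under an unrecognized key, so the model
        # behaves as if no input was given at all.
        raise ValueError("You have to specify either input_ids or inputs_embeds")
    return input_ids

x_encoder = [[101, 2023, 102]]             # toy token ids
tf_t5_call({"inputs": x_encoder})          # recognized key: works
try:
    tf_t5_call({"input_word_ids": x_encoder})  # wrong key: misleading error
except ValueError as e:
    print(e)
```

This explains why the original `build_model` failed: the data was supplied, but under a name the model did not recognize.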

patrickvonplaten commented on Apr 5, 2020 (2 reactions)

I’m not 100% sure what you want to do here exactly. T5 is always trained in a text-to-text format. We have a section here on how to train T5: https://huggingface.co/transformers/model_doc/t5.html#training

Otherwise I’d recommend taking a look at the official paper.


Top Results From Across the Web

  • I get a "You have to specify either input_ids or inputs_embeds ...
    My input sequence is unconstrained (any sentence), and my output sequence is formal language that resembles assembly.
  • "You have to specify either input_ids or inputs_embeds", but I ...
    The problem is that there's probably a renaming procedure in the code; since we use an encoder-decoder architecture we have 2 types of...
  • modeling_tf_t5.py - CodaLab Worksheets
    ... raise ValueError("You have to specify either input_ids or inputs_embeds") if inputs_embeds is None: assert self.embed_tokens is not None, "You have to ...
  • bert-base-chinese-for-tnews - Kaggle
    ... raise ValueError("You have to specify either input_ids or inputs_embeds") device = input_ids.device if input_ids is not None else inputs_embeds.device ...
  • BERT Inner Workings - TOPBOTS
    I believe it's easy to follow along if you have the code next to the ... have to specify either input_ids or inputs_embeds")...
