
I’d like to use T5 to encode my sentences. I see there is a class called T5, but it is deprecated. I also see t5 in the list of basic_transformer_models:

https://github.com/UKPLab/sentence-transformers/blob/7cc10218b038fcda2ebbe5c080ea486d5813b8e8/sentence_transformers/SentenceTransformer.py#L63

Here is what I’ve tried:

from sentence_transformers import SentenceTransformer

# Sentences to embed
X = [
    'hello how are you',
    'how now brown cow',
]

# Load t5-base directly as a SentenceTransformer and encode the sentences
embedder = SentenceTransformer(model_name_or_path="t5-base")
e = embedder.encode(X, convert_to_numpy=True)
print(e)

And this is the error:

WARNING:root:No sentence-transformers model found with name /global/home/hpc3552/.cache/torch/sentence_transformers/t5-base. Creating a new one with MEAN pooling.
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Traceback (most recent call last):
  File "/global/home/hpc3552/conversation_analytics/data/get_embeddings_test.py", line 13, in <module>
    e = embedder.encode(X, convert_to_numpy=True)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/sentence_transformers/SentenceTransformer.py", line 157, in encode
    out_features = self.forward(features)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/sentence_transformers/models/Transformer.py", line 51, in forward
    output_states = self.auto_model(**trans_features, return_dict=False)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 1401, in forward
    decoder_outputs = self.decoder(
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 906, in forward
    raise ValueError(f"You have to specify either {err_msg_prefix}input_ids or {err_msg_prefix}inputs_embeds")
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
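The error appears to come from loading "t5-base" as the full encoder-decoder model: SentenceTransformer only feeds encoder inputs, but T5's forward pass also expects decoder input ids. A minimal workaround sketch that uses only the T5 encoder from transformers and mean-pools the token embeddings by hand (the max_length and pooling code are illustrative, not part of the original report):

import torch
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
encoder = T5EncoderModel.from_pretrained("t5-base")  # encoder only, no decoder needed

X = ["hello how are you", "how now brown cow"]
batch = tokenizer(X, padding=True, truncation=True, max_length=256, return_tensors="pt")

with torch.no_grad():
    # (batch, seq_len, hidden_dim) token embeddings from the T5 encoder
    token_embeddings = encoder(**batch).last_hidden_state

# Mean pooling over non-padding tokens
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 768]) for t5-base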

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

3 reactions
nreimers commented, Jan 19, 2022

Added support for the T5 model in models.Transformer: https://github.com/UKPLab/sentence-transformers/commit/06a38f6c34d1f28f34b38a454fb8764a2e51881b

It is still experimental and might change with the next commit.
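A minimal sketch of how that module-level setup might look (assuming a sentence-transformers version that includes the commit above; the model name and max_seq_length are illustrative):

from sentence_transformers import SentenceTransformer, models

# Load the T5 encoder via models.Transformer and add a pooling layer on top;
# models.Pooling defaults to mean pooling over the token embeddings.
word_embedding_model = models.Transformer("t5-base", max_seq_length=256)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

embeddings = model.encode(["hello how are you", "how now brown cow"])
print(embeddings.shape)  # (2, 768) for t5-base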

I tested different T5 models (t5-small, t5-base, google/t5-v1_1-small, google/t5-v1_1-base).

Here are some numbers when training on the STSb dataset:

  • distilbert-base-uncased: 83.35 (4 epochs)
  • t5-small: 62.66 (4 epochs)
  • t5-small: 75.38 (20 epochs)
  • t5-base: 71.52 (4 epochs)
  • google/t5-v1_1-small: 30.77 (4 epochs)
  • google/t5-v1_1-small: 41.73 (20 epochs)
  • google/t5-v1_1-base: 43.80 (4 epochs)

So model performance is not really good, and quite a high number of training steps is needed to get results comparable to BERT-based models.

Would be happy to hear more results
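For anyone wanting to reproduce numbers along these lines, a rough sketch of the standard sentence-transformers STSb-style training loop (the example pairs and hyperparameters here are illustrative placeholders, not the exact setup behind the numbers above):

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Build a T5-based model as in the earlier sketch.
word_embedding_model = models.Transformer("t5-small", max_seq_length=256)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Toy STS-style pairs; real STSb labels are similarity scores normalized to [0, 1].
train_examples = [
    InputExample(texts=["A man is eating food.", "A man is eating a piece of bread."], label=0.8),
    InputExample(texts=["A plane is taking off.", "A man is playing a flute."], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

# Cosine-similarity regression training, as in the usual STSb recipe.
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=4,
    warmup_steps=100,
)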

1 reaction
stepthom commented, Jan 20, 2022

@nreimers With t5-base, the results were much worse than with all-mpnet. (I am clustering documents.)

I guess this is expected, since all-mpnet was pretrained by you and t5-base was not (and therefore only mean pooling is used).

That leads to a question: why did you pretrain some models (mpnet, distilbert, etc., as shown on the Pretrained Models page and available on the Hugging Face Model Hub) but not others (T5, the GPT family, XLNet)? What criteria did you use?

Thanks!
