
I’d like to use T5 to encode my sentences. I see there is a class called T5, but it is deprecated. I also see t5 in the list of basic_transformer_models:

https://github.com/UKPLab/sentence-transformers/blob/7cc10218b038fcda2ebbe5c080ea486d5813b8e8/sentence_transformers/SentenceTransformer.py#L63

Here is what I’ve tried:

from sentence_transformers import SentenceTransformer

# Sentences to embed
X = [
    'hello how are you',
    'how now brown cow',
]

# Load t5-base directly as a SentenceTransformer and encode the sentences
embedder = SentenceTransformer(model_name_or_path="t5-base")
e = embedder.encode(X, convert_to_numpy=True)
print(e)

And this is the error:

WARNING:root:No sentence-transformers model found with name /global/home/hpc3552/.cache/torch/sentence_transformers/t5-base. Creating a new one with MEAN pooling.
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Traceback (most recent call last):
  File "/global/home/hpc3552/conversation_analytics/data/get_embeddings_test.py", line 13, in <module>
    e = embedder.encode(X, convert_to_numpy=True)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/sentence_transformers/SentenceTransformer.py", line 157, in encode
    out_features = self.forward(features)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/sentence_transformers/models/Transformer.py", line 51, in forward
    output_states = self.auto_model(**trans_features, return_dict=False)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 1401, in forward
    decoder_outputs = self.decoder(
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 906, in forward
    raise ValueError(f"You have to specify either {err_msg_prefix}input_ids or {err_msg_prefix}inputs_embeds")
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
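The error appears to come from loading "t5-base" as the full encoder-decoder model: SentenceTransformer only feeds encoder inputs, but T5's forward pass also expects decoder input ids. A minimal workaround sketch that uses only the T5 encoder from transformers and mean-pools the token embeddings by hand (the max_length and pooling code are illustrative, not part of the original report):

import torch
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
encoder = T5EncoderModel.from_pretrained("t5-base")  # encoder only, no decoder needed

X = ["hello how are you", "how now brown cow"]
batch = tokenizer(X, padding=True, truncation=True, max_length=256, return_tensors="pt")

with torch.no_grad():
    # (batch, seq_len, hidden_dim) token embeddings from the T5 encoder
    token_embeddings = encoder(**batch).last_hidden_state

# Mean pooling over non-padding tokens
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 768]) for t5-base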

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

3 reactions
nreimers commented, Jan 19, 2022

Added support for the T5 model in models.Transformer: https://github.com/UKPLab/sentence-transformers/commit/06a38f6c34d1f28f34b38a454fb8764a2e51881b

It is still experimental and might change with the next commit.
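A minimal sketch of how that module-level setup might look (assuming a sentence-transformers version that includes the commit above; the model name and max_seq_length are illustrative):

from sentence_transformers import SentenceTransformer, models

# Load the T5 encoder via models.Transformer and add a pooling layer on top;
# models.Pooling defaults to mean pooling over the token embeddings.
word_embedding_model = models.Transformer("t5-base", max_seq_length=256)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

embeddings = model.encode(["hello how are you", "how now brown cow"])
print(embeddings.shape)  # (2, 768) for t5-base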

I tested different T5 models (t5-small, t5-base, google/t5-v1_1-small, google/t5-v1_1-base).

Here are some numbers when training on the STSb dataset:

  • distilbert-base-uncased: 83.35 (4 epochs)
  • t5-small: 62.66 (4 epochs)
  • t5-small: 75.38 (20 epochs)
  • t5-base: 71.52 (4 epochs)
  • google/t5-v1_1-small: 30.77 (4 epochs)
  • google/t5-v1_1-small: 41.73 (20 epochs)
  • google/t5-v1_1-base: 43.80 (4 epochs)

So model performance is not really good, and quite a high number of training steps is needed to get results comparable to BERT-based models.

Would be happy to hear more results
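For anyone wanting to reproduce numbers along these lines, a rough sketch of the standard sentence-transformers STSb-style training loop (the example pairs and hyperparameters here are illustrative placeholders, not the exact setup behind the numbers above):

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Build a T5-based model as in the earlier sketch.
word_embedding_model = models.Transformer("t5-small", max_seq_length=256)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Toy STS-style pairs; real STSb labels are similarity scores normalized to [0, 1].
train_examples = [
    InputExample(texts=["A man is eating food.", "A man is eating a piece of bread."], label=0.8),
    InputExample(texts=["A plane is taking off.", "A man is playing a flute."], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

# Cosine-similarity regression training, as in the usual STSb recipe.
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=4,
    warmup_steps=100,
)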

1 reaction
stepthom commented, Jan 20, 2022

@nreimers With t5-base, the results were much worse than with all-mpnet. (I am clustering documents.)

I guess this is expected, since all-mpnet was pretrained by you and t5-base was not (and therefore only mean pooling is used).

That leads to a question: why did you pretrain some models (mpnet, distilbert, etc., as shown on the Pretrained Models page and available on the Hugging Face Model Hub) but not others (T5, the GPT family, XLNet)? What criteria did you use?

Thanks!
