How to use T5?
I'd like to use T5 to encode my sentences. I see there is a class called T5, but it is deprecated. I also see t5 in the list of basic_transformer_models.
Here is what I’ve tried:
```python
from sentence_transformers import SentenceTransformer

X = [
    'hello how are you',
    'how now brown cow',
]

embedder = SentenceTransformer(model_name_or_path="t5-base")
e = embedder.encode(X, convert_to_numpy=True)
print(e)
```
And this is the error:
```
WARNING:root:No sentence-transformers model found with name /global/home/hpc3552/.cache/torch/sentence_transformers/t5-base. Creating a new one with MEAN pooling.
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Traceback (most recent call last):
  File "/global/home/hpc3552/conversation_analytics/data/get_embeddings_test.py", line 13, in <module>
    e = embedder.encode(X, convert_to_numpy=True)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/sentence_transformers/SentenceTransformer.py", line 157, in encode
    out_features = self.forward(features)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/sentence_transformers/models/Transformer.py", line 51, in forward
    output_states = self.auto_model(**trans_features, return_dict=False)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 1401, in forward
    decoder_outputs = self.decoder(
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/global/home/hpc3552/conversation_analytics/ca_env5/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 906, in forward
    raise ValueError(f"You have to specify either {err_msg_prefix}input_ids or {err_msg_prefix}inputs_embeds")
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
```
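The traceback shows what goes wrong: loading "t5-base" this way pulls in the full encoder-decoder T5 model, and its forward pass also expects decoder inputs, which SentenceTransformer never supplies. As a rough workaround outside of sentence-transformers, the encoder alone can be used directly. The following is a minimal sketch using Hugging Face's T5EncoderModel with mean pooling, not the sentence-transformers API discussed in this issue:

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

# Load only the T5 encoder, so no decoder inputs are required.
tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = T5EncoderModel.from_pretrained("t5-base")

sentences = ["hello how are you", "how now brown cow"]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    out = model(**batch)  # encoder-only forward pass

# Mean-pool the token embeddings, ignoring padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (out.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (2, 768) for t5-base
```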
Added support for the T5 model in models.Transformer: https://github.com/UKPLab/sentence-transformers/commit/06a38f6c34d1f28f34b38a454fb8764a2e51881b
It is still experimental and might change with the next commit.
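With that commit in place, building the model explicitly from modules follows the usual sentence-transformers pattern. A sketch, assuming a sentence-transformers version that includes the commit above (the max_seq_length value is illustrative):

```python
from sentence_transformers import SentenceTransformer, models

# Load t5-base through the (experimental) T5 support in models.Transformer
# and add an explicit mean-pooling layer on top of the encoder outputs.
word_embedding_model = models.Transformer("t5-base", max_seq_length=256)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

embeddings = model.encode(["hello how are you", "how now brown cow"])
print(embeddings.shape)
```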
I tested different T5 models (t5-small, t5-base, google/t5-v1_1-small, google/t5-v1_1-base).
Here are some numbers from training on the STSb dataset:
- distilbert-base-uncased: 83.35 (4 epochs)
- t5-small: 62.66 (4 epochs)
- t5-small: 75.38 (20 epochs)
- t5-base: 71.52 (4 epochs)
- google/t5-v1_1-small: 30.77 (4 epochs)
- google/t5-v1_1-small: 41.73 (20 epochs)
- google/t5-v1_1-base: 43.80 (4 epochs)
So model performance is not great, and quite a high number of training steps is needed to get results comparable to BERT-based models.
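For context, the STSb runs above follow the standard sentence-transformers fine-tuning recipe. The sketch below shows roughly what such a run looks like; the dataset source, batch size, and warmup steps are assumptions and are not taken from this thread:

```python
from torch.utils.data import DataLoader
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, InputExample, losses

# STSb sentence pairs with similarity scores in [0, 5], rescaled to [0, 1].
dataset = load_dataset("stsb_multi_mt", "en", split="train")
train_examples = [
    InputExample(
        texts=[row["sentence1"], row["sentence2"]],
        label=row["similarity_score"] / 5.0,
    )
    for row in dataset
]

# Relies on the experimental T5 support; mean pooling is added automatically.
model = SentenceTransformer("t5-small")
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=4,
    warmup_steps=100,
)
```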
Would be happy to hear more results
@nreimers With t5-base, the results were much worse compared with all-mpnet. (I am clustering documents.)
I guess this is expected, since all-mpnet was pretrained by you and t5-base was not (and therefore only mean pooling is used).
That leads to a question: why did you pretrain some models (mpnet, distilbert, etc., as shown on the Pretrained Models page and available on the Hugging Face Model Hub) but not others (T5, the GPT family, XLNet)? What criteria did you use?
Thanks!