T5 batch inference: same input data gives different outputs?
❓ Questions & Help
Details
Hi, I am using a T5 model loaded from a TF checkpoint. I have set model.eval(), but every time I forward exactly the same input data, the outputs are different. Here is my code:
import torch
from transformers import T5Config, T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained('t5-base')
config = T5Config.from_pretrained('t5-base')
model = T5ForConditionalGeneration.from_pretrained(
    "../model/t5-base/model.ckpt-1004000", from_tf=True, config=config)
model.eval()

def prediction(documents, query):
    querys = [query] * len(documents)
    encoded_encoder_inputs = tokenizer(documents, padding=True, truncation=True, return_tensors="pt")
    encoded_decoder_inputs = tokenizer(querys, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(input_ids=encoded_encoder_inputs["input_ids"],
                        labels=encoded_decoder_inputs["input_ids"],
                        attention_mask=encoded_encoder_inputs["attention_mask"])
    batch_logits = outputs[1]  # with labels passed, outputs are (loss, logits, ...)
    print(batch_logits)

documents = ['my dog is cute!', "so am I."]
query = "who am I?"
prediction(documents, query)
Here are the outputs from two runs of the code above:
tensor([[[-14.6796, -6.2236, -13.3517, ..., -42.7422, -42.7124, -42.8204],
         [-18.5999, -4.9112, -10.6610, ..., -40.7506, -40.7313, -40.8373],
         [-17.3894, -5.3482, -11.4917, ..., -41.2223, -41.1643, -41.3228],
         [-18.4449, -5.9145, -12.0056, ..., -42.3857, -42.3579, -42.4859]],

        [[-15.7967, -6.9833, -14.5827, ..., -41.3168, -41.0326, -40.9567],
         [-18.4241, -5.7193, -12.0748, ..., -40.1744, -40.0635, -39.9045],
         [-19.5852, -5.1691, -12.7764, ..., -42.2655, -42.0788, -41.9885],
         [-20.3673, -3.6864, -12.5264, ..., -40.1189, -40.0787, -39.8976]]])

tensor([[[-14.6796, -6.2236, -13.3517, ..., -42.7422, -42.7124, -42.8204],
         [-18.5848, -4.9116, -10.6607, ..., -40.7443, -40.7251, -40.8300],
         [-17.3248, -5.3050, -11.4988, ..., -41.1802, -41.1236, -41.2818],
         [-18.3950, -5.8967, -11.9756, ..., -42.3553, -42.3273, -42.4553]],

        [[-15.7967, -6.9833, -14.5827, ..., -41.3168, -41.0326, -40.9567],
         [-18.5133, -5.7466, -12.1301, ..., -40.2199, -40.1078, -39.9497],
         [-19.3636, -5.1246, -12.7134, ..., -42.1881, -41.9993, -41.9106],
         [-20.3247, -3.6335, -12.4559, ..., -40.0285, -39.9912, -39.8106]]])
Did I do anything wrong?
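One way to narrow this down (a hedged sketch, not from the original thread; it uses the stock t5-base Hub weights because the local TF checkpoint above isn't available) is to compare two forward passes inside a single process. With model.eval() and torch.no_grad(), repeated calls on identical tensors should agree; if they do, the run-to-run differences come from something outside the forward pass, e.g. how the checkpoint is loaded.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Assumption: stock Hub weights stand in for the issue's local TF checkpoint.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

enc = tokenizer(["my dog is cute!", "so am I."], padding=True, return_tensors="pt")
dec = tokenizer(["who am I?", "who am I?"], padding=True, return_tensors="pt")

with torch.no_grad():
    first = model(input_ids=enc["input_ids"],
                  attention_mask=enc["attention_mask"],
                  labels=dec["input_ids"])
    second = model(input_ids=enc["input_ids"],
                   attention_mask=enc["attention_mask"],
                   labels=dec["input_ids"])

# With dropout disabled by model.eval(), the two logit tensors should match.
print(torch.equal(first[1], second[1]))

If this prints True while separate script invocations still disagree, the forward pass itself is deterministic and the variation is introduced earlier in the pipeline.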
Thanks @patrickvonplaten. I followed your suggestion, and after reading through the documentation and the code for how decoder_input_ids is used, it became clear why it affects the logits. That cleared up my confusion.

Hi @ArvinZhuang, yes: similar to you, when I run the model call with the same input_ids but different labels, I get different outcomes (using PyTorch). I realized I had missed part of the print, so I edited my comment above to match the output. I hope @patrickvonplaten or someone from the huggingface team can take a second look so we can get to the bottom of this.
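To make the decoder_input_ids point concrete, here is a hedged sketch (again assuming the stock t5-base weights rather than the thread's TF checkpoint): when labels are passed, T5ForConditionalGeneration internally shifts them one position right to build decoder_input_ids, so the logits at step t are conditioned on the label tokens before t (teacher forcing). Changing the labels therefore changes every row of the logits except the first, which only sees the decoder start token.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

enc = tokenizer(["my dog is cute!"], return_tensors="pt")
labels_a = tokenizer(["who am I?"], return_tensors="pt")["input_ids"]
labels_b = tokenizer(["something else entirely"], return_tensors="pt")["input_ids"]

with torch.no_grad():
    # labels are shifted right internally to form decoder_input_ids,
    # so each step's logits depend on the preceding label tokens.
    logits_a = model(**enc, labels=labels_a)[1]
    logits_b = model(**enc, labels=labels_b)[1]

# Step 0 is conditioned only on the decoder start token, so it agrees
# (up to float noise); later steps see different label prefixes and diverge.
print(torch.allclose(logits_a[:, 0], logits_b[:, 0], atol=1e-5))  # expect True
print(logits_a.shape, logits_b.shape)  # sequence lengths follow the labels

For inference, the usual route is model.generate(...) (or passing an explicit, fixed decoder_input_ids) so the logits do not depend on whatever target happens to be passed as labels.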