
T5 batch inference same input data gives different outputs?

See original GitHub issue

❓ Questions & Help

Details

Hi, I am using a T5 model loaded from a TensorFlow checkpoint. I have set model.eval(), but every time I forward exactly the same input data, the outputs are different. Here is my code:

import torch
from transformers import T5Config, T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained('t5-base')
config = T5Config.from_pretrained('t5-base')
# Load weights converted from a TensorFlow checkpoint.
model = T5ForConditionalGeneration.from_pretrained(
    "../model/t5-base/model.ckpt-1004000", from_tf=True, config=config)
model.eval()

def prediction(documents, query):
    queries = [query] * len(documents)
    encoded_encoder_inputs = tokenizer(documents, padding=True, truncation=True, return_tensors="pt")
    encoded_decoder_inputs = tokenizer(queries, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(input_ids=encoded_encoder_inputs["input_ids"],
                        labels=encoded_decoder_inputs["input_ids"],
                        attention_mask=encoded_encoder_inputs["attention_mask"])
        batch_logits = outputs[1]  # with labels passed, outputs[0] is the loss and outputs[1] the logits
        print(batch_logits)


documents = ['my dog is cute!', "so am I."]
query = "who am I?"
prediction(documents, query)

Here are the outputs from two runs of the code above:

tensor([[[-14.6796,  -6.2236, -13.3517,  ..., -42.7422, -42.7124, -42.8204],
         [-18.5999,  -4.9112, -10.6610,  ..., -40.7506, -40.7313, -40.8373],
         [-17.3894,  -5.3482, -11.4917,  ..., -41.2223, -41.1643, -41.3228],
         [-18.4449,  -5.9145, -12.0056,  ..., -42.3857, -42.3579, -42.4859]],
        [[-15.7967,  -6.9833, -14.5827,  ..., -41.3168, -41.0326, -40.9567],
         [-18.4241,  -5.7193, -12.0748,  ..., -40.1744, -40.0635, -39.9045],
         [-19.5852,  -5.1691, -12.7764,  ..., -42.2655, -42.0788, -41.9885],
         [-20.3673,  -3.6864, -12.5264,  ..., -40.1189, -40.0787, -39.8976]]])
tensor([[[-14.6796,  -6.2236, -13.3517,  ..., -42.7422, -42.7124, -42.8204],
         [-18.5848,  -4.9116, -10.6607,  ..., -40.7443, -40.7251, -40.8300],
         [-17.3248,  -5.3050, -11.4988,  ..., -41.1802, -41.1236, -41.2818],
         [-18.3950,  -5.8967, -11.9756,  ..., -42.3553, -42.3273, -42.4553]],
        [[-15.7967,  -6.9833, -14.5827,  ..., -41.3168, -41.0326, -40.9567],
         [-18.5133,  -5.7466, -12.1301,  ..., -40.2199, -40.1078, -39.9497],
         [-19.3636,  -5.1246, -12.7134,  ..., -42.1881, -41.9993, -41.9106],
         [-20.3247,  -3.6335, -12.4559,  ..., -40.0285, -39.9912, -39.8106]]])

Did I do anything wrong?
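
For reference, a minimal within-process determinism check looks like the sketch below. It assumes the stock t5-base weights rather than the converted TF checkpoint above, and uses the .logits field that recent transformers versions expose (it corresponds to outputs[1] in the code above). In eval mode under torch.no_grad(), two identical forward passes should produce identical logits, so a False here would point at a stochastic op still being active.

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

documents = ["my dog is cute!", "so am I."]
enc = tokenizer(documents, padding=True, truncation=True, return_tensors="pt")
dec = tokenizer(["who am I?"] * len(documents), padding=True, truncation=True,
                return_tensors="pt")

# Two identical forward passes in eval mode. With dropout disabled and no
# other stochastic ops, the logits should match exactly.
with torch.no_grad():
    first = model(input_ids=enc["input_ids"],
                  attention_mask=enc["attention_mask"],
                  labels=dec["input_ids"]).logits
    second = model(input_ids=enc["input_ids"],
                   attention_mask=enc["attention_mask"],
                   labels=dec["input_ids"]).logits

print(torch.allclose(first, second))  # expected: True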

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 15 (4 by maintainers)

Top GitHub Comments

1 reaction
ednussi commented, Oct 29, 2020

Thanks @patrickvonplaten. I followed your suggestion, and after reading through the documentation and the code for how decoder_input_ids is used, it became clear why it affects the logits. That cleared up my confusion.
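
For context, that mechanism is sketched below, assuming the stock t5-base weights: when only labels are passed, T5ForConditionalGeneration shifts them one position to the right, starting from config.decoder_start_token_id, and uses the result as decoder_input_ids, which is why the labels change the logits.

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

enc = tokenizer(["my dog is cute!", "so am I."], padding=True, return_tensors="pt")
labels = tokenizer(["who am I?"] * 2, padding=True, return_tensors="pt")["input_ids"]

# Reproduce the internal shift: prepend the decoder start token and drop
# the last label position.
start = torch.full((labels.size(0), 1), model.config.decoder_start_token_id,
                   dtype=labels.dtype)
decoder_input_ids = torch.cat([start, labels[:, :-1]], dim=1)

with torch.no_grad():
    via_labels = model(input_ids=enc["input_ids"],
                       attention_mask=enc["attention_mask"],
                       labels=labels).logits
    via_decoder_inputs = model(input_ids=enc["input_ids"],
                               attention_mask=enc["attention_mask"],
                               decoder_input_ids=decoder_input_ids).logits

# Identical logits: the labels only matter because they become decoder inputs.
print(torch.allclose(via_labels, via_decoder_inputs))  # expected: True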

1 reaction
ednussi commented, Oct 27, 2020

Hi @ArvinZhuang, yes, similar to you, when I run the model call with the same input_ids but different labels I get different outcomes (using PyTorch). I realized I had missed part of the print, so I edited my comment above to match the output. Hoping @patrickvonplaten or someone from the Hugging Face team can take a second look so we can get to the bottom of this.
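
That observation is expected under teacher forcing; a small sketch of it follows (stock t5-base weights, with the labels padded to the same length so the logit tensors are comparable, both hypothetical choices for illustration):

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

enc = tokenizer(["my dog is cute!", "so am I."], padding=True, return_tensors="pt")
labels_a = tokenizer(["who am I?"] * 2, padding="max_length", max_length=8,
                     return_tensors="pt")["input_ids"]
labels_b = tokenizer(["where is my dog?"] * 2, padding="max_length", max_length=8,
                     return_tensors="pt")["input_ids"]

with torch.no_grad():
    logits_a = model(input_ids=enc["input_ids"],
                     attention_mask=enc["attention_mask"], labels=labels_a).logits
    logits_b = model(input_ids=enc["input_ids"],
                     attention_mask=enc["attention_mask"], labels=labels_b).logits

# Position 0 sees only the shared decoder start token, so it matches;
# later positions see different (shifted) labels, so the logits diverge.
print(torch.allclose(logits_a[:, 0], logits_b[:, 0]))  # True
print(torch.allclose(logits_a, logits_b))              # False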


