T5 batch inference: same input data gives different outputs?
❓ Questions & Help
Details
Hi, I am using a T5 model loaded from a TF checkpoint. I have set model.eval(), but every time I forward exactly the same input data, the outputs are different. Here is my code:
import torch
from transformers import T5Config, T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained('t5-base')
config = T5Config.from_pretrained('t5-base')
model = T5ForConditionalGeneration.from_pretrained(
    "../model/t5-base/model.ckpt-1004000", from_tf=True, config=config)
model.eval()

def prediction(documents, query):
    querys = [query] * len(documents)
    encoded_encoder_inputs = tokenizer(documents, padding=True, truncation=True, return_tensors="pt")
    encoded_decoder_inputs = tokenizer(querys, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(input_ids=encoded_encoder_inputs["input_ids"],
                        labels=encoded_decoder_inputs["input_ids"],
                        attention_mask=encoded_encoder_inputs["attention_mask"])
    batch_logits = outputs[1]  # with labels passed, outputs are (loss, logits, ...)
    print(batch_logits)

documents = ['my dog is cute!', "so am I."]
query = "who am I?"
prediction(documents, query)
Here are the outputs from two runs of the code above:
tensor([[[-14.6796, -6.2236, -13.3517, ..., -42.7422, -42.7124, -42.8204],
         [-18.5999, -4.9112, -10.6610, ..., -40.7506, -40.7313, -40.8373],
         [-17.3894, -5.3482, -11.4917, ..., -41.2223, -41.1643, -41.3228],
         [-18.4449, -5.9145, -12.0056, ..., -42.3857, -42.3579, -42.4859]],

        [[-15.7967, -6.9833, -14.5827, ..., -41.3168, -41.0326, -40.9567],
         [-18.4241, -5.7193, -12.0748, ..., -40.1744, -40.0635, -39.9045],
         [-19.5852, -5.1691, -12.7764, ..., -42.2655, -42.0788, -41.9885],
         [-20.3673, -3.6864, -12.5264, ..., -40.1189, -40.0787, -39.8976]]])

tensor([[[-14.6796, -6.2236, -13.3517, ..., -42.7422, -42.7124, -42.8204],
         [-18.5848, -4.9116, -10.6607, ..., -40.7443, -40.7251, -40.8300],
         [-17.3248, -5.3050, -11.4988, ..., -41.1802, -41.1236, -41.2818],
         [-18.3950, -5.8967, -11.9756, ..., -42.3553, -42.3273, -42.4553]],

        [[-15.7967, -6.9833, -14.5827, ..., -41.3168, -41.0326, -40.9567],
         [-18.5133, -5.7466, -12.1301, ..., -40.2199, -40.1078, -39.9497],
         [-19.3636, -5.1246, -12.7134, ..., -42.1881, -41.9993, -41.9106],
         [-20.3247, -3.6335, -12.4559, ..., -40.0285, -39.9912, -39.8106]]])
Did I do anything wrong?
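One way to narrow this down (a hedged sketch, not from the original thread; it uses the stock t5-base Hub weights because the local TF checkpoint above isn't available) is to compare two forward passes inside a single process. With model.eval() and torch.no_grad(), repeated calls on identical tensors should agree; if they do, the run-to-run differences come from something outside the forward pass, e.g. how the checkpoint is loaded.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Assumption: stock Hub weights stand in for the issue's local TF checkpoint.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

enc = tokenizer(["my dog is cute!", "so am I."], padding=True, return_tensors="pt")
dec = tokenizer(["who am I?", "who am I?"], padding=True, return_tensors="pt")

with torch.no_grad():
    first = model(input_ids=enc["input_ids"],
                  attention_mask=enc["attention_mask"],
                  labels=dec["input_ids"])
    second = model(input_ids=enc["input_ids"],
                   attention_mask=enc["attention_mask"],
                   labels=dec["input_ids"])

# With dropout disabled by model.eval(), the two logit tensors should match.
print(torch.equal(first[1], second[1]))

If this prints True while separate script invocations still disagree, the forward pass itself is deterministic and the variation is introduced earlier in the pipeline.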
Thanks @patrickvonplaten. I followed your suggestion, and after reading through the documentation and the code for how decoder_input_ids is used, it became clear why it affects the logits. That cleared up my confusion.

Hi @ArvinZhuang, yes: similar to you, when I run the model call with the same input_ids but different labels, I get different outcomes (using PyTorch). I realized I had missed part of the print, so I edited my comment above to match the output. I hope @patrickvonplaten or someone from the huggingface team can take a second look so we can get to the bottom of this.
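To make the decoder_input_ids point concrete, here is a hedged sketch (again assuming the stock t5-base weights rather than the thread's TF checkpoint): when labels are passed, T5ForConditionalGeneration internally shifts them one position right to build decoder_input_ids, so the logits at step t are conditioned on the label tokens before t (teacher forcing). Changing the labels therefore changes every row of the logits except the first, which only sees the decoder start token.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.eval()

enc = tokenizer(["my dog is cute!"], return_tensors="pt")
labels_a = tokenizer(["who am I?"], return_tensors="pt")["input_ids"]
labels_b = tokenizer(["something else entirely"], return_tensors="pt")["input_ids"]

with torch.no_grad():
    # labels are shifted right internally to form decoder_input_ids,
    # so each step's logits depend on the preceding label tokens.
    logits_a = model(**enc, labels=labels_a)[1]
    logits_b = model(**enc, labels=labels_b)[1]

# Step 0 is conditioned only on the decoder start token, so it agrees
# (up to float noise); later steps see different label prefixes and diverge.
print(torch.allclose(logits_a[:, 0], logits_b[:, 0], atol=1e-5))  # expect True
print(logits_a.shape, logits_b.shape)  # sequence lengths follow the labels

For inference, the usual route is model.generate(...) (or passing an explicit, fixed decoder_input_ids) so the logits do not depend on whatever target happens to be passed as labels.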