Trainer.evaluate does not support seq2seq models
🐛 Bug
Information
Hi! I can’t thank you enough for Transformers. I know that the Trainer is still under development, but would like to report this just to know the current status.
Currently Trainer._prediction_loop assumes that every batch of logits has the same shape. Specifically, this line fails when the sequence length differs between batches:
preds = torch.cat((preds, logits.detach()), dim=0)
This makes Trainer.evaluate unusable for models with variable-length output (e.g. seq2seq models). One possible solution is to pad all batches to the same length, but that is pretty inefficient.
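For illustration, here is a rough sketch of that padding workaround: an accumulation helper that right-pads the shorter tensor along the sequence dimension before concatenating. The helper name cat_padded is hypothetical and not part of transformers.

```python
import torch
import torch.nn.functional as F

def cat_padded(preds, logits, pad_value=-100):
    # Hypothetical helper: pad the shorter tensor along the sequence
    # dimension so that torch.cat along dim=0 succeeds even when batches
    # were padded only to their own maximum length.
    logits = logits.detach()
    if preds is None:
        return logits
    max_len = max(preds.size(1), logits.size(1))
    # F.pad pads trailing dims first: (vocab_left, vocab_right, seq_left, seq_right)
    preds = F.pad(preds, (0, 0, 0, max_len - preds.size(1)), value=pad_value)
    logits = F.pad(logits, (0, 0, 0, max_len - logits.size(1)), value=pad_value)
    return torch.cat((preds, logits), dim=0)
```

This keeps evaluation running, at the cost of padding all accumulated logits to a common length, which is the inefficiency mentioned above.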
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQUaD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior (a minimal sketch is shown after the list):
- create a seq2seq model
- pad each batch only to the maximum sequence length within that batch
- create a Trainer for the model and call .evaluate()
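A hypothetical minimal script along these lines triggers the traceback below. The model choice is illustrative, and eval_dataset and collate_fn are assumed to be defined elsewhere (with the collator padding only within each batch).

```python
# Illustrative reproduction for transformers 2.x; not the exact script used.
from transformers import BartForConditionalGeneration, Trainer, TrainingArguments

model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./output"),
    eval_dataset=eval_dataset,   # assumed: a seq2seq eval dataset
    data_collator=collate_fn,    # assumed: pads each batch to its own max length
)
trainer.evaluate()  # fails in _prediction_loop once two batches differ in length
```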
Traceback (most recent call last):
File "/home/vlialin/miniconda3/lib/python3.7/site-packages/transformers/trainer.py", line 509, in train
self.evaluate()
File "/home/vlialin/miniconda3/lib/python3.7/site-packages/transformers/trainer.py", line 696, in evaluate
output = self._prediction_loop(eval_dataloader, description="Evaluation")
File "/home/vlialin/miniconda3/lib/python3.7/site-packages/transformers/trainer.py", line 767, in _prediction_loop
preds = torch.cat((preds, logits.detach()), dim=0)
RuntimeError: Sizes of tensors must match except in dimension 0. Got 29 and 22 in dimension 1
Expected behavior
Trainer is able to evaluate seq2seq models.
Environment info
- transformers version: 2.11
- Platform: Linux
- Python version: 3.7.6
- PyTorch version (GPU?): 1.5.0
- Tensorflow version (GPU?): 2.2.0
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Top GitHub Comments
Still no updates on this issue?
Hi @Guitaricet, if you only want to evaluate the loss (AFAIK this is the case for seq2seq models), then you can set prediction_loss_only to True.
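For concreteness, a sketch of that suggestion, assuming the transformers 2.x Trainer where prediction_loss_only can be passed at construction time (in later releases the flag lives on TrainingArguments instead; check your installed version):

```python
# Sketch assuming the transformers 2.x Trainer signature; in newer versions
# set prediction_loss_only on TrainingArguments instead.
from transformers import Trainer, TrainingArguments

trainer = Trainer(
    model=model,                  # your seq2seq model, assumed defined elsewhere
    args=TrainingArguments(output_dir="./output"),
    eval_dataset=eval_dataset,    # assumed defined elsewhere
    prediction_loss_only=True,    # accumulate only the loss, never the logits
)
metrics = trainer.evaluate()      # returns the eval loss without concatenating logits
```

With this flag set, _prediction_loop never builds the preds tensor, so the shape mismatch does not occur; the trade-off is that metrics which need the collected predictions cannot be computed.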