google/pegasus-cnn_dailymail generates blank file
Environment info
- transformers version: 4.2.0 and 4.5.1
- Platform: linux
- Python version: 3.6
- PyTorch version (GPU?): 1.7.1
- Tensorflow version (GPU?): NA
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: Yes (and I also try to not use distributed but problem exists)
Who can help
@patrickvonplaten, @patil-suraj
Information
Model I am using (Bert, XLNet …): google/pegasus-cnn_dailymail
The problem arises when using:
- the official example scripts: run_distributed_eval.py from https://github.com/huggingface/transformers/tree/master/examples/legacy/seq2seq
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQUaD task: summarization with ROUGE
- my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior:
I am trying to generate the summaries from Pegasus on the CNN/DM and XSUM datasets. I use the same datasets shared by HuggingFace (from the README.md in https://github.com/huggingface/transformers/tree/master/examples/legacy/seq2seq). My experiments run on 3 V100 GPUs. I use google/pegasus-cnn_dailymail for CNN/DM and google/pegasus-xsum for XSUM.
- The results on XSUM are perfect. I run the command below and receive the ROUGE scores:
{'rouge1': 47.0271, 'rouge2': 24.4924, 'rougeL': 39.2529, 'n_obs': 11333, 'seconds_per_sample': 0.035, 'n_gpus': 3}
```bash
python -m torch.distributed.launch --nproc_per_node=3 run_distributed_eval.py \
    --model_name google/pegasus-xsum \
    --save_dir $OUTPUT_DIR \
    --data_dir $DATA_DIR \
    --bs 64 \
    --fp16
```
- I was expecting similar SOTA performance on CNN/DM, so I run the command below and receive:
{"n_gpus": 3, "n_obs": 11490, "rouge1": 0.1602, "rouge2": 0.084, "rougeL": 0.1134, "seconds_per_sample": 0.1282}
(Note: the batch size is reduced here due to memory limitations. Although the experiments are run on the same devices, CNN/DM requires more memory given the nature of the dataset itself.)
```bash
python -m torch.distributed.launch --nproc_per_node=3 run_distributed_eval.py \
    --model_name google/pegasus-cnn_dailymail \
    --save_dir $OUTPUT_DIR \
    --data_dir $DATA_DIR \
    --bs 32 \
    --fp16
```
- I look at the generated test_generations.txt file to try to figure out why google/pegasus-cnn_dailymail doesn't work. I found that most of the lines in test_generations.txt are blank. (Please see the attached image for an example; a small sketch below counts the blank lines.)
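For reference, a minimal sketch for counting the blank generations; the file path is an assumption and should point at $OUTPUT_DIR/test_generations.txt:

```python
# Count empty lines in the generated summaries file.
# The path below is an assumption; adjust it to your $OUTPUT_DIR.
from pathlib import Path

lines = Path("output/test_generations.txt").read_text().splitlines()
n_blank = sum(1 for line in lines if not line.strip())
print(f"{n_blank}/{len(lines)} generated summaries are blank")
```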
Expected behavior
It is so weird that google/pegasus-xsum works perfectly while google/pegasus-cnn_dailymail does not generate summaries successfully. I am confused, so I switched the transformers version (4.2.0 and 4.5.1) and re-ran the experiments on different GPUs, but the problem persists. Could you please give me any suggestions? Thank you!
Top GitHub Comments
Hi @chz816
I can reproduce the issue. This is because Pegasus doesn't really work with fp16: it was trained with bfloat16, so in most cases it overflows and returns nan logits. The model works as expected in fp32, so if you run the above command without the --fp16 arg, it should give the expected results.
cc @stas00
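For a quick sanity check outside the distributed script, a minimal fp32 generation sketch (the article text is just a placeholder, not taken from the dataset):

```python
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model_name = "google/pegasus-cnn_dailymail"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = PegasusTokenizer.from_pretrained(model_name)
# Keep the default fp32 weights; casting the model to fp16 is what triggers the nan logits described above.
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(device)

article = "PG&E scheduled the blackouts in response to forecasts for high winds amid dry conditions."  # placeholder text
batch = tokenizer(article, truncation=True, padding="longest", return_tensors="pt").to(device)
summary_ids = model.generate(**batch)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True))
```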
I’m able to reproduce this with the “modern” version of the script: