Problems with VQA finetuning
Hello! I am trying to finetune OFA-large on VQA using the Visual Genome dataset, following the finetuning instructions in the repo. Unfortunately, I have encountered a bug that I have some difficulty identifying. I preprocessed the data exactly as in the example, but during training my gradients overflow and the model does not train.
slice_id 0 seek offset 0
2022-03-28 02:29:07 - trainer.py[line:703] - INFO: begin training epoch 1
2022-03-28 02:29:07 - train.py[line:296] - INFO: Start iterating over samples
2022-03-28 02:29:09 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 64.0
2022-03-28 02:29:11 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 32.0
2022-03-28 02:29:14 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 16.0
2022-03-28 02:29:15 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 8.0
2022-03-28 02:29:17 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 4.0
2022-03-28 02:29:19 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 2.0
2022-03-28 02:29:22 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 1.0
2022-03-28 02:29:23 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 0.5
2022-03-28 02:29:26 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 0.25
2022-03-28 02:29:28 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 0.125
2022-03-28 02:29:28 - trainer.py[line:922] - INFO: NOTE: gradient overflow detected, ignoring gradient, setting loss scale to: 0.0625
I narrowed the issue down to the answers column. If I replace this column in my dataset with the one from the dataset provided in the repo, everything works fine. However, if I change the answers in that column, or modify them in any way, I get the same issue. I suspected that my procedure for changing the column might be the problem, but if I “modify” the column with an empty string, it still works; adding any other symbol to the column again results in an overflow. I also tried modifying single elements rather than the whole column, and found that changing certain answers does not lead to an overflow, while changing others does. I was unable to narrow the issue further or find any pattern in it.
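One possible explanation, suggested by the maintainers' replies below, is that modified answers fall outside the candidate answer set used to build the answer-to-index dict. A minimal sanity check along these lines could flag such answers before training (the helper name `find_unknown_answers` is hypothetical, not part of the OFA codebase):

```python
def find_unknown_answers(answers, ans2label):
    """Return the ground-truth answers that are absent from the candidate
    dict. Any such answer cannot be mapped to a label index and is a
    likely trigger for broken training."""
    return sorted({a for a in answers if a not in ans2label})


# Example: "maybe" is not a candidate answer, so it is reported.
ans2label = {"yes": 0, "no": 1}
unknown = find_unknown_answers(["yes", "maybe", "no", "maybe"], ans2label)
# unknown == ["maybe"]
```

Running a check like this over every answer in the training and validation TSVs would show whether the edited answers are the ones missing from the dict.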
I train on a single server with 1 GPU.
Issue Analytics
- Created a year ago
- Comments: 9 (5 by maintainers)
Top GitHub Comments
@phanxuanphucnd Yes. In our practice on the VQAv2 dataset, which has a long-tailed distribution over all the ground-truth answers that appear, we follow the common practice of using the 3,129 most frequent answers as the candidate set to build this dict. We then filtered the original training and validation splits, keeping only the question-answer pairs whose answer is in this candidate set for finetuning OFA.
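The procedure described above can be sketched roughly as follows; `build_candidate_set` and `filter_pairs` are illustrative names, not functions from the OFA repo:

```python
from collections import Counter


def build_candidate_set(train_answers, top_k=3129):
    """Keep only the top_k most frequent ground-truth answers."""
    counts = Counter(train_answers)
    return {ans for ans, _ in counts.most_common(top_k)}


def filter_pairs(qa_pairs, candidates):
    """Drop question-answer pairs whose answer is not a candidate."""
    return [(q, a) for q, a in qa_pairs if a in candidates]


# Toy example with top_k=3: "dog" (least frequent) is excluded,
# so the "7" pair would also be dropped from the training split.
answers = ["yes", "no", "yes", "2", "dog", "yes", "no", "2"]
cands = build_candidate_set(answers, top_k=3)
pairs = [("is it sunny?", "yes"), ("how many?", "7")]
kept = filter_pairs(pairs, cands)
```

On VQAv2 this filtering is standard practice: the 3,129-answer cutoff keeps the classification head tractable while still covering the vast majority of training pairs.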
Hi, it’s a Python dict that maps each candidate answer text to its index (starting from 0). The indices can be assigned at random with no specific rules; just make sure that each candidate answer is assigned a unique index and that the indices run contiguously from 0. All the ground-truth answers of the training and validation samples should be included in this candidate answer set.
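A minimal sketch of building such a dict (the name `build_ans2label` is an assumption for illustration, not OFA's actual helper):

```python
def build_ans2label(candidate_answers):
    """Map each unique candidate answer to an index.

    Sorting is just one way to fix an order; any assignment works as
    long as every answer gets a unique index and the indices run
    contiguously from 0.
    """
    return {ans: i for i, ans in enumerate(sorted(set(candidate_answers)))}


ans2label = build_ans2label(["yes", "no", "2", "dog"])
# Four unique answers -> four contiguous indices 0..3.
```

Duplicate answers are deduplicated first, so feeding in the raw answer column of the filtered training split is fine.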