LXMERT pre-training tasks
❓ Questions & Help
Hello, congrats to all contributors for the awesome work with LXMERT! It is exciting to see multimodal transformers coming to huggingface/transformers. Of course, I immediately tried it out and played with the demo.
LXMERT pre-trained model, trained on what exactly?
Question:
Does the line `lxmert_base = LxmertForPreTraining.from_pretrained("unc-nlp/lxmert-base-uncased")`
load an LXMERT model already pre-trained on the tasks enumerated in the original paper, “(1) masked cross-modality language modeling, (2) masked object prediction via RoI-feature regression, (3) masked object prediction via detected-label classification, (4) cross-modality matching, and (5) image question answering” (Tan & Bansal, 2019)? If the pre-training tasks are not all the ones from the paper, would that line load pre-trained weights at all, and if so, trained on what?
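For reference, here is a minimal sketch of the line in question. The config attributes printed below are the pre-training task flags I assume exist on `LxmertConfig` based on the documentation, so treat them as an assumption rather than a verified listing:

```python
# Minimal sketch (assumes a transformers version with LXMERT support).
# The config flags below are assumed from the LxmertConfig docs and indicate
# which pre-training heads/tasks are enabled for this checkpoint.
from transformers import LxmertForPreTraining

lxmert_base = LxmertForPreTraining.from_pretrained("unc-nlp/lxmert-base-uncased")

config = lxmert_base.config
print(config.task_mask_lm)      # masked cross-modality language modeling
print(config.task_obj_predict)  # masked object prediction (RoI regression + label classification)
print(config.task_matched)      # cross-modality matching
print(config.task_qa)           # image question answering
```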
Thanks in advance! 🤗
A link to original question on the forum/Stack Overflow: Here is the link to the Hugging Face forum.
Is there any entry-level example of LXMERT? Following the example from LXMERT, this comes up.
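For anyone looking for an entry-level example, here is a minimal sketch of a forward pass through the base encoder. The visual inputs are random placeholders; a real pipeline would feed Faster R-CNN RoI features (e.g. 36 boxes with 2048-dim features) and normalized box coordinates, as in the original LXMERT setup:

```python
import torch
from transformers import LxmertTokenizer, LxmertModel

tokenizer = LxmertTokenizer.from_pretrained("unc-nlp/lxmert-base-uncased")
model = LxmertModel.from_pretrained("unc-nlp/lxmert-base-uncased")

inputs = tokenizer("A cat is sitting on the couch.", return_tensors="pt")

# Dummy visual inputs: 36 regions with 2048-dim features and normalized boxes.
visual_feats = torch.randn(1, 36, 2048)
visual_pos = torch.rand(1, 36, 4)

outputs = model(**inputs, visual_feats=visual_feats, visual_pos=visual_pos)
print(outputs.language_output.shape)  # (1, seq_len, 768)
print(outputs.vision_output.shape)    # (1, 36, 768)
print(outputs.pooled_output.shape)    # (1, 768)
```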
Hi, “unc-nlp/lxmert-base-uncased” was trained with all the tasks specified in the paper (as mentioned above). We have benchmarked the pre-trained model to make sure it reaches the same performance on all QA tasks. If you run into any trouble, please let me know!
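To make that concrete, the checkpoint’s pre-training heads can be inspected directly. This is only a sketch with dummy visual features; the output attribute names are taken from the `LxmertForPreTrainingOutput` documentation:

```python
import torch
from transformers import LxmertTokenizer, LxmertForPreTraining

tokenizer = LxmertTokenizer.from_pretrained("unc-nlp/lxmert-base-uncased")
model = LxmertForPreTraining.from_pretrained("unc-nlp/lxmert-base-uncased")

inputs = tokenizer("What is on the couch?", return_tensors="pt")
visual_feats = torch.randn(1, 36, 2048)  # dummy RoI features
visual_pos = torch.rand(1, 36, 4)        # dummy normalized box coordinates

outputs = model(**inputs, visual_feats=visual_feats, visual_pos=visual_pos)
print(outputs.prediction_logits.shape)         # masked cross-modality LM head
print(outputs.cross_relationship_score.shape)  # cross-modality matching head
print(outputs.question_answering_score.shape)  # image QA head
```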