Add ViLT to HuggingFace Transformers
Hi,
I’ve been reading the ViLT paper and was impressed by the simplicity, as it only adds text embeddings to a ViT.
As ViT is already available in HuggingFace Transformers, adding ViLT should be relatively easy.
I’ve currently implemented the model (see here for my current implementation). It includes a conversion script (convert_vilt_original_to_pytorch.py) to convert the weights from this repository (the PyTorch Lightning module) to its HuggingFace counterpart, for all models (base one + the ones with a head on top).
However, I’m facing some issues when performing a forward pass with the original implementation in Google Colab (when just doing pip install -r requirements.txt and running the demo_vqa.py script, you get the following):
Traceback (most recent call last):
File "demo_vqa.py", line 17, in <module>
from vilt.modules import ViLTransformerSS
File "/content/ViLT/vilt/modules/__init__.py", line 1, in <module>
from .vilt_module import ViLTransformerSS
File "/content/ViLT/vilt/modules/vilt_module.py", line 3, in <module>
import pytorch_lightning as pl
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/__init__.py", line 62, in <module>
from pytorch_lightning import metrics
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/__init__.py", line 14, in <module>
from pytorch_lightning.metrics.metric import Metric
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/metric.py", line 23, in <module>
from pytorch_lightning.metrics.utils import _flatten, dim_zero_cat, dim_zero_mean, dim_zero_sum
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/utils.py", line 18, in <module>
from pytorch_lightning.utilities import rank_zero_warn
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/__init__.py", line 24, in <module>
from pytorch_lightning.utilities.apply_func import move_data_to_device
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>
from torchtext.data import Batch
ImportError: cannot import name 'Batch' from 'torchtext.data' (/usr/local/lib/python3.7/dist-packages/torchtext/data/__init__.py)
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at ipython-dev@python.org
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True
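An ImportError like this one usually points to a mismatch between the installed pytorch_lightning and the torchtext that ships with the Colab image, so it helps to record the exact versions in the report. A minimal sketch, assuming Python 3.8+ (for importlib.metadata); the helper name installed_versions is made up for illustration and the package list is only a guess at what is relevant here:

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version (None if absent)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of packages involved in the traceback above.
    report = installed_versions(["torch", "torchtext", "pytorch-lightning"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing this output against the versions pinned in requirements.txt should show which package drifted.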
Upgrading PyTorch Lightning to the latest version also returns an error:
Traceback (most recent call last):
File "demo_vqa.py", line 17, in <module>
from vilt.modules import ViLTransformerSS
File "/content/ViLT/vilt/modules/__init__.py", line 1, in <module>
from .vilt_module import ViLTransformerSS
File "/content/ViLT/vilt/modules/vilt_module.py", line 7, in <module>
from vilt.modules import heads, objectives, vilt_utils
File "/content/ViLT/vilt/modules/vilt_utils.py", line 11, in <module>
from vilt.gadgets.my_metrics import Accuracy, VQAScore, Scalar
File "/content/ViLT/vilt/gadgets/my_metrics.py", line 2, in <module>
from pytorch_lightning.metrics import Metric
ModuleNotFoundError: No module named 'pytorch_lightning.metrics'
This happens because PyTorch Lightning deprecated the metrics module.
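One possible workaround is a fallback import that tries the old location first and then the standalone torchmetrics package, which, as far as I know, is where the Lightning metrics were moved. A minimal sketch under that assumption; import_first_available is a hypothetical helper, not something from the ViLT repository, and the Metric base class may not behave identically across versions:

```python
import importlib


def import_first_available(module_names, attribute):
    """Return `attribute` from the first module in `module_names` that provides it."""
    errors = []
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attribute)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{module_name}: {exc}")
    raise ImportError(f"could not import {attribute!r}; tried: " + "; ".join(errors))


# Hypothetical use in vilt/gadgets/my_metrics.py (an assumption, not the authors' code):
# Metric = import_first_available(["pytorch_lightning.metrics", "torchmetrics"], "Metric")
```

This only fixes the import itself; any metric subclasses in the original code may still need changes if the base class API differs.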
Are you able to provide a simple Colab notebook to perform inference on an image+text pair?
Thanks!
Issue Analytics
- Created 2 years ago
- Comments: 13 (5 by maintainers)
Top GitHub Comments
Hi,
ViLT was added to HuggingFace Transformers some time ago! 🥳
Docs can be found here: https://huggingface.co/docs/transformers/main/en/model_doc/vilt
Demo notebooks can be found here.
All models can be found on the hub.
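For anyone looking for the image+text inference asked about above, here is a minimal sketch of visual question answering with the Transformers port. It assumes transformers, torch and Pillow are installed and uses the dandelin/vilt-b32-finetuned-vqa checkpoint from the hub; the wrapper function answer_question is made up for illustration, and the linked docs remain the reference:

```python
def answer_question(image, question, checkpoint="dandelin/vilt-b32-finetuned-vqa"):
    """Return the highest-scoring answer string for a PIL image and a question."""
    # Imported lazily so that defining the function does not require the heavy dependencies.
    import torch
    from transformers import ViltForQuestionAnswering, ViltProcessor

    processor = ViltProcessor.from_pretrained(checkpoint)
    model = ViltForQuestionAnswering.from_pretrained(checkpoint)
    model.eval()

    # The processor tokenizes the text and prepares the pixel values in one call.
    encoding = processor(image, question, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits

    # The VQA head is a classifier over a fixed answer vocabulary.
    return model.config.id2label[logits.argmax(-1).item()]


# Example call (the file name is a placeholder):
# from PIL import Image
# print(answer_question(Image.open("example.jpg"), "How many cats are there?"))
```

The function reloads the checkpoint on every call to keep the sketch short; for repeated use, load the processor and model once and reuse them.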
https://huggingface.co/dandelin/vilt-b32-mlm-itm
https://huggingface.co/dandelin/vilt-b32-finetuned-vqa
I’ve made these two repos as I saw you converted these two models. I would also like to convert the rest of the models we’ve released (IRTR, NLVR2) in my spare time, following your HF Transformers implementation, and write model cards for them too.
Thanks!