Add ViLT to HuggingFace Transformers

Hi,

I’ve been reading the ViLT paper and was impressed by the simplicity, as it only adds text embeddings to a ViT.

As ViT is already available in HuggingFace Transformers, adding ViLT should be relatively easy.

I’ve currently implemented the model (see here for my current implementation). It includes a conversion script (convert_vilt_original_to_pytorch.py) to convert the weights from this repository (the PyTorch Lightning module) to its HuggingFace counterpart, for all models (base one + the ones with a head on top).

However, I’m facing some issues when performing a forward pass with the original implementation in Google Colab. After just running pip install -r requirements.txt and then the demo_vqa.py script, I get the following:

Traceback (most recent call last):
  File "demo_vqa.py", line 17, in <module>
    from vilt.modules import ViLTransformerSS
  File "/content/ViLT/vilt/modules/__init__.py", line 1, in <module>
    from .vilt_module import ViLTransformerSS
  File "/content/ViLT/vilt/modules/vilt_module.py", line 3, in <module>
    import pytorch_lightning as pl
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/__init__.py", line 62, in <module>
    from pytorch_lightning import metrics
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.metric import Metric
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/metric.py", line 23, in <module>
    from pytorch_lightning.metrics.utils import _flatten, dim_zero_cat, dim_zero_mean, dim_zero_sum
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/utils.py", line 18, in <module>
    from pytorch_lightning.utilities import rank_zero_warn
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/__init__.py", line 24, in <module>
    from pytorch_lightning.utilities.apply_func import move_data_to_device
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>
    from torchtext.data import Batch
ImportError: cannot import name 'Batch' from 'torchtext.data' (/usr/local/lib/python3.7/dist-packages/torchtext/data/__init__.py)

Upgrading PyTorch Lightning to the latest version also returns an error:

Traceback (most recent call last):
  File "demo_vqa.py", line 17, in <module>
    from vilt.modules import ViLTransformerSS
  File "/content/ViLT/vilt/modules/__init__.py", line 1, in <module>
    from .vilt_module import ViLTransformerSS
  File "/content/ViLT/vilt/modules/vilt_module.py", line 7, in <module>
    from vilt.modules import heads, objectives, vilt_utils
  File "/content/ViLT/vilt/modules/vilt_utils.py", line 11, in <module>
    from vilt.gadgets.my_metrics import Accuracy, VQAScore, Scalar
  File "/content/ViLT/vilt/gadgets/my_metrics.py", line 2, in <module>
    from pytorch_lightning.metrics import Metric
ModuleNotFoundError: No module named 'pytorch_lightning.metrics'

This happens because PL deprecated the metrics module, and the latest version no longer ships pytorch_lightning.metrics at all.
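One possible workaround for the second error is to make the import in vilt/gadgets/my_metrics.py tolerant of the module's location. The snippet below is only a minimal sketch: import_first_available is a hypothetical helper (it is not part of ViLT or PyTorch Lightning), and using the standalone torchmetrics package as the fallback is an assumption that has not been tested against the ViLT code.

```python
import importlib


def import_first_available(module_names, attr_name):
    """Return `attr_name` from the first module in `module_names` that provides it.

    Hypothetical helper for illustration; not part of ViLT or PyTorch Lightning.
    """
    errors = []
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            # ModuleNotFoundError is a subclass of ImportError, so a missing
            # module and a missing name inside a module are both handled here.
            errors.append(f"{module_name}: {exc!r}")
    raise ImportError(
        f"could not import {attr_name!r} from any of: " + "; ".join(errors)
    )


# Assumed usage inside my_metrics.py (requires one of the two packages):
# Metric = import_first_available(
#     ["pytorch_lightning.metrics", "torchmetrics"], "Metric"
# )
```

This would not help with the first traceback, where importing pytorch_lightning itself fails inside vilt_module.py; that case likely needs mutually compatible versions of pytorch_lightning and torchtext.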

Are you able to provide a simple Colab notebook to perform inference on an image+text pair?

Thanks!

Issue Analytics

Created: 2 years ago
Comments: 13 (5 by maintainers)

Top GitHub Comments

NielsRogge commented, May 7, 2022

Hi,

ViLT was added to HuggingFace Transformers some time ago! 🥳

Docs can be found here: https://huggingface.co/docs/transformers/main/en/model_doc/vilt

Demo notebooks can be found here.

All models can be found on the hub.

dandelin commented, Nov 30, 2021

https://huggingface.co/dandelin/vilt-b32-mlm-itm https://huggingface.co/dandelin/vilt-b32-finetuned-vqa

I’ve made these two repos since I saw you had converted these two models. I would also like to convert the rest of the models we’ve released (IRTR, NLVR2) in my spare time, following your HF Transformers implementation, and write model cards for them too.

Thanks!
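For the image+text inference asked about in the issue, a minimal sketch using the Transformers port might look like the following. It assumes torch, transformers and Pillow are installed and that the dandelin/vilt-b32-finetuned-vqa checkpoint linked above can be downloaded from the hub; answer_question is a hypothetical wrapper written for this example, not a library function.

```python
def answer_question(image, question, checkpoint="dandelin/vilt-b32-finetuned-vqa"):
    """Return the most likely VQA answer for a PIL image and a question string.

    Hypothetical wrapper for illustration. The heavy imports are kept inside
    the function so that defining it does not require the libraries or network.
    """
    import torch
    from transformers import ViltForQuestionAnswering, ViltProcessor

    # The processor prepares both modalities: it tokenizes the text and
    # resizes/normalizes the image.
    processor = ViltProcessor.from_pretrained(checkpoint)
    model = ViltForQuestionAnswering.from_pretrained(checkpoint)
    model.eval()

    encoding = processor(image, question, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits

    # The VQA head is a classifier over a fixed answer vocabulary.
    predicted_index = logits.argmax(-1).item()
    return model.config.id2label[predicted_index]
```

Calling it would then be a matter of opening an image with PIL.Image.open(...) and passing it together with a question such as "How many cats are there?". Loading the processor and model on every call is wasteful; for repeated queries they would normally be created once and reused.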
