Models from transformers versions above 4.1.1 can't be wrapped with configure_interpretable_embedding_layer
🐛 Bug
When using the latest version of the transformers package, models cannot be wrapped with the configure_interpretable_embedding_layer function. The new package version gets the shape of the input with `batch_size, seq_length = input_shape`, but the input we pass has three dimensions because we pass the embedding, so we get a 'too many values to unpack (expected 2)' error. Is it possible to solve this problem with the latest version of the package?
To Reproduce
You can see the issue on this Colab notebook:
If you uncomment the `!pip install transformers==4.1.1` line, which installs a lower version, the error does not occur.
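A minimal sketch of the failing pattern (the notebook above has the full example); the model name and sentence below are illustrative, and the error comes from the unpacking line mentioned above:

```python
# Rough reproduction sketch, assuming transformers > 4.1.1 and a BERT-style model;
# the model name and example sentence are illustrative.
from transformers import BertModel, BertTokenizer
from captum.attr import (
    configure_interpretable_embedding_layer,
    remove_interpretable_embedding_layer,
)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# Wrap the word-embedding layer so precomputed embeddings can be fed in place of token ids.
interpretable_emb = configure_interpretable_embedding_layer(model, "embeddings.word_embeddings")

input_ids = tokenizer("a short example sentence", return_tensors="pt")["input_ids"]
input_emb = interpretable_emb.indices_to_embeddings(input_ids)  # shape: (1, seq_len, hidden_size)

# Newer transformers versions unpack the input shape as `batch_size, seq_length = input_shape`,
# so the 3-D embedding tensor raises "ValueError: too many values to unpack (expected 2)".
try:
    model(input_emb)
except ValueError as err:
    print(err)

remove_interpretable_embedding_layer(model, interpretable_emb)
```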
Issue Analytics
- Created 3 years ago
- Reactions: 2
- Comments: 7 (5 by maintainers)
Thank you @NarineK for the suggestion. I don’t use Integrated Gradients, but showcase other approaches like Saliency and InputXGradient in that tutorial.
I just tried the previous idea using the embeddings parameter and it worked. I did the following:
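The snippet itself isn't reproduced here, but a rough sketch of that approach, assuming a BERT-style sequence classifier from transformers (the model name, forward wrapper, and target class below are illustrative), could look like this:

```python
# Sketch: compute the input embeddings ourselves and hand them to the model via
# `inputs_embeds`, so no embedding layer needs to be wrapped or swapped out.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from captum.attr import Saliency

model_name = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

encoded = tokenizer("a short example sentence", return_tensors="pt")
input_ids, attention_mask = encoded["input_ids"], encoded["attention_mask"]

# Precompute the embeddings with the model's own embedding table.
input_embeds = model.get_input_embeddings()(input_ids)

def forward_with_embeds(embeds, attention_mask):
    # Forward function that accepts embeddings directly; returns class logits.
    return model(inputs_embeds=embeds, attention_mask=attention_mask).logits

saliency = Saliency(forward_with_embeds)
attributions = saliency.attribute(
    input_embeds,
    target=1,  # illustrative target class
    additional_forward_args=(attention_mask,),
)
# Collapse the hidden dimension to get one score per token.
token_scores = attributions.sum(dim=-1).squeeze(0)
print(token_scores)
```

InputXGradient can be swapped in for Saliency with the same forward function, since both only need a forward pass that accepts embeddings.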
More importantly, this doesn’t need the configure_interpretable_embedding_layer/remove_interpretable_embedding_layer functionality and would not even work if you accidentally use it (which was my mistake last time I tried).
I would even recommend this approach whenever possible: being able to pass either the embeddings or the token_ids to a model, rather than relying on the configure_interpretable_embedding_layer/remove_interpretable_embedding_layer functionality, gives the user more control.
It would be good to update it. There is a case where I look into the token, word, and position embeddings separately with the configured layer. We might be able to do that with multi-layer IG right now, and the layer conductance case would need a fix too.
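A rough sketch of what that separate-embedding attribution could look like with the multi-layer form of LayerIntegratedGradients, assuming a BERT-style classifier (the layer paths, forward wrapper, and target below are illustrative):

```python
# Sketch: attribute to the word, position, and token-type embedding sub-layers
# separately by passing a list of layers to LayerIntegratedGradients.
from transformers import BertForSequenceClassification, BertTokenizer
from captum.attr import LayerIntegratedGradients

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

def forward_func(input_ids, attention_mask):
    # Return class logits so attributions can target a specific class.
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

encoded = tokenizer("a short example sentence", return_tensors="pt")
input_ids, attention_mask = encoded["input_ids"], encoded["attention_mask"]

# One attribution tensor is returned per listed embedding sub-layer.
lig = LayerIntegratedGradients(
    forward_func,
    [
        model.bert.embeddings.word_embeddings,
        model.bert.embeddings.position_embeddings,
        model.bert.embeddings.token_type_embeddings,
    ],
)
word_attrs, pos_attrs, type_attrs = lig.attribute(
    input_ids,
    target=1,  # illustrative target class
    additional_forward_args=(attention_mask,),
)
```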