encoder-decoder attention?
Dear Author,
Thank you very much for the visualization tool. It is super helpful.
We now need to visualize the encoder-decoder attention rather than the self-attention. With the following code, for instance, we can only visualize the self-attention. It would be great if you could give some clue on how to visualize the encoder-decoder attention of BART.
```python
from bertviz import model_view, head_view
from transformers import BartForConditionalGeneration, BartTokenizer

model = BartForConditionalGeneration.from_pretrained(
    "facebook/bart-large", force_bos_token_to_be_generated=True
)
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")

sentence_a = "The cat sat on the mat"
sentence_b = "The cat lay on the rug"
inputs = tokenizer.encode_plus(
    sentence_a, sentence_b, return_tensors="pt", add_special_tokens=True
)
input_ids = inputs["input_ids"]

# This only picks up self-attention weights, which is why the views
# below show self-attention rather than encoder-decoder attention.
attention = model(input_ids, output_attentions=True)[2]

input_id_list = input_ids[0].tolist()  # batch index 0
tokens = tokenizer.convert_ids_to_tokens(input_id_list)
model_view(attention, tokens)
head_view(attention, tokens)
```
Thank you! Best, Shirley
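The question above boils down to where the encoder-decoder weights live in the model output. For reference, a minimal sketch (not from the original issue; variable names are illustrative) of how all three attention types can be read off a BART forward pass by name in recent transformers releases:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Illustrative setup; mirrors the question above but without the
# version-specific force_bos_token_to_be_generated flag.
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")

encoder_ids = tokenizer("The cat sat on the mat", return_tensors="pt").input_ids
decoder_ids = tokenizer("The cat lay on the rug", return_tensors="pt").input_ids

outputs = model(
    input_ids=encoder_ids,
    decoder_input_ids=decoder_ids,  # feed the target side explicitly
    output_attentions=True,
    return_dict=True,
)

# Each attribute is a tuple of per-layer tensors shaped
# (batch, heads, query_len, key_len).
encoder_attention = outputs.encoder_attentions  # encoder self-attention
decoder_attention = outputs.decoder_attentions  # decoder self-attention
cross_attention = outputs.cross_attentions      # encoder-decoder attention
```

Passing decoder_input_ids explicitly makes the decoder attend to a chosen target sequence, which is what an encoder-decoder attention view needs.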
Issue Analytics
- Created 3 years ago
- Comments: 5 (3 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi @xwuShirley, I’ve started a branch to visualize encoder-decoder attention: https://github.com/jessevig/bertviz/tree/encoder-decoder
I’ve added the encoder_decoder.ipynb notebook. Please let me know if that works for you. I’ll also work on adding support for the model view and pushing a new version to PyPI soon.
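For reference, a sketch of how the pieces from the earlier snippet (tokenizer, encoder_ids, decoder_ids, outputs) might be passed to head_view. The keyword arguments are an assumption based on the encoder-decoder support bertviz later shipped; the encoder_decoder.ipynb notebook in the branch is the authoritative example.

```python
# Continues from the earlier sketch; the head_view keyword arguments below
# are an assumption, not confirmed against the branch itself.
from bertviz import head_view

encoder_tokens = tokenizer.convert_ids_to_tokens(encoder_ids[0])
decoder_tokens = tokenizer.convert_ids_to_tokens(decoder_ids[0])

head_view(
    encoder_attention=outputs.encoder_attentions,  # encoder self-attention
    decoder_attention=outputs.decoder_attentions,  # decoder self-attention
    cross_attention=outputs.cross_attentions,      # encoder-decoder attention
    encoder_tokens=encoder_tokens,
    decoder_tokens=decoder_tokens,
)
```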
Super helpful!! Thanks!