PAD symbols change the output
Adding [PAD] symbols to an input sentence changes the output of the model. I put together a small example here:
https://gist.github.com/juditacs/8be068d5f9063ad68e3098a473b497bd
I also noticed that the seed state affects the output as well. Resetting it in every run ensures that the output is always the same. Is this because of layernorm?
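For reference, a minimal sketch of that comparison (not the linked gist itself; it assumes the current transformers API and bert-base-uncased): run the same sentence once unpadded and once padded with an attention_mask, then compare the hidden states at the real token positions. Calling model.eval() disables dropout, which is the usual reason the output depends on the random seed rather than layernorm.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").eval()  # eval() turns off dropout

text = "The cat sat on the mat."

# Encode once without padding and once padded out to 16 tokens.
plain = tokenizer(text, return_tensors="pt")
padded = tokenizer(text, padding="max_length", max_length=16, return_tensors="pt")

with torch.no_grad():
    h_plain = model(**plain).last_hidden_state
    h_padded = model(**padded).last_hidden_state

# Compare only the positions that hold real tokens in both runs.
n = plain["input_ids"].shape[1]
print((h_plain[0, :n] - h_padded[0, :n]).abs().max().item())
```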
Due to position embeddings, the same token yields a different vector at every position. You might want to google “How the Embedding Layers in BERT Were Implemented”.
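A quick way to see this (a sketch, assuming bert-base-uncased, where [PAD] has token id 0): feed a sequence made entirely of [PAD] tokens through the embedding layer alone. The word embedding is identical at every position, but the position embedding added to it is not, so the resulting vectors differ.

```python
import torch
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased").eval()

# Ten [PAD] tokens (id 0 in bert-base-uncased).
input_ids = torch.zeros(1, 10, dtype=torch.long)

with torch.no_grad():
    # BertEmbeddings = word + position + token-type embeddings, then LayerNorm.
    vectors = model.embeddings(input_ids)[0]

# Same token everywhere, yet each position yields a different vector.
print((vectors[7] - vectors[8]).abs().max().item())  # non-zero
```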
@thomwolf Despite the attention_mask, the values are slightly different. Is it normal that the [PAD] vectors have different values? Here is the output; the [PAD] vectors differ, is that normal?

7 0.28312715888023376
8 0.08457585424184799
9 -0.3077544569969177
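This is expected: the attention_mask only stops the real tokens from attending to [PAD]; the [PAD] positions themselves still pass through the network and, because of their position embeddings, end up with different hidden states. A small sketch along the lines of the output above (hypothetical sentence and max_length; the printed numbers will not match the ones posted):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").eval()

enc = tokenizer("Hello world", padding="max_length", max_length=10, return_tensors="pt")

with torch.no_grad():
    hidden = model(**enc).last_hidden_state[0]

# Print the first hidden-state dimension at every [PAD] position:
# the values differ from position to position, which is expected.
for i in (enc["attention_mask"][0] == 0).nonzero().flatten().tolist():
    print(i, hidden[i, 0].item())
```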