What does .encode() actually do to extract the embeddings?
Looking at the code, `.encode()` obtains the embeddings by looping through the hidden states (i.e. the sequence outputs from all layers).
Question: But the output embedding is 1D, with shape `(hidden_dim,)`, e.g. `(512,)` or `(768,)`, rather than 2D: `(len(input_ids), hidden_dim)` for text models or `(num_patches, hidden_dim)` for vision models. So are the sentence embeddings obtained by adding an MLP head on top of the sequence output of all layers, or by adding an MLP head on top of the pooler output? Please clarify.
Also, is the MLP head the classic one, i.e. a single linear layer (in/out features) with the classic tanh activation, and no dropout of any sort?
Please advise …
Issue Analytics
- Created: a year ago
- Comments: 5 (3 by maintainers)
Just the mean of the last layer.
In most cases mean pooling is performed: all token output embeddings from the last layer are averaged to give the fixed-size representation.
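As a minimal sketch of what mean pooling does (not the library's exact implementation): the `(seq_len, hidden_dim)` token embeddings from the last layer are averaged over the sequence axis, weighted by the attention mask so padding tokens are ignored, which yields the 1D `(hidden_dim,)` vector you observed. The function name and toy tensors below are illustrative.

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over the sequence axis, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).float()    # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)  # (batch, hidden_dim)
    counts = mask.sum(dim=1).clamp(min=1e-9)       # (batch, 1), avoid div-by-zero
    return summed / counts

# Toy example: batch of 1, seq_len 4 (last position is padding), hidden_dim 3
tokens = torch.tensor([[[1.0, 1.0, 1.0],
                        [2.0, 2.0, 2.0],
                        [3.0, 3.0, 3.0],
                        [9.0, 9.0, 9.0]]])  # padding row, excluded by the mask
mask = torch.tensor([[1, 1, 1, 0]])

emb = mean_pool(tokens, mask)
print(emb.shape)  # torch.Size([1, 3]) -- one fixed-size vector per sentence
print(emb)        # tensor([[2., 2., 2.]])
```

So no MLP head is involved for the default pooling: the 2D per-token output collapses to a single `(hidden_dim,)` vector purely by averaging.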