SentenceTransformer API vs. Transformer API + pooling
See original GitHub issue.

In your documentation you mention two approaches to using your package to create sentence embeddings.
First, from the Quickstart, you wrote:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('distilbert-base-nli-stsb-mean-tokens')

# Our sentences we would like to encode
sentences = ['This framework generates embeddings for each input sentence',
             'Sentences are passed as a list of strings.',
             'The quick brown fox jumps over the lazy dog.']

# Sentences are encoded by calling model.encode()
sentence_embeddings = model.encode(sentences)

print(sentence_embeddings.shape)
# (3, 768)
Second, from Sentence Embeddings with Transformers, you wrote:
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/bert-base-nli-mean-tokens")
model = AutoModel.from_pretrained("sentence-transformers/bert-base-nli-mean-tokens")
# Model is of type: transformers.modeling_bert.BertModel

# Tokenize the sentences from the first example
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling. In this case, mean pooling
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

print(sentence_embeddings.shape)
# torch.Size([3, 768])
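The second snippet calls a mean_pooling helper that is not shown here. For reference, a minimal version along the lines of the helper in the linked documentation (a sketch; define it before running the snippet above) could look like this:

import torch

def mean_pooling(model_output, attention_mask):
    # First element of the model output holds the per-token embeddings
    token_embeddings = model_output[0]
    # Expand the attention mask so padded tokens do not contribute to the average
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, dim=1) / torch.clamp(mask.sum(dim=1), min=1e-9)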
What are the important differences between these two approaches? The only difference I can see is that in the second approach, the BertModel returns token embeddings and you perform the pooling (mean or max) manually. If I use this second approach, what would I be missing compared to SentenceTransformer.encode()?
@nreimers Why don’t you use GPT2 as the basis of a Sentence Transformer model?

@githubrandomuser2017 When SBERT was created, GPT2 was not available. I never tested GPT2, but I think the masked language modeling pre-training used by BERT is a better pre-training task for sentence embeddings than the causal language modeling used by GPT2. That said, it would be easy to fine-tune and test GPT2 with sentence-transformers.
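To illustrate that last point, wrapping GPT-2 in a SentenceTransformer by combining a Transformer module with a mean-pooling module might look roughly like the sketch below. The 'gpt2' checkpoint, the max_seq_length value, and the pad-token workaround are assumptions for illustration, not something verified in this thread:

from sentence_transformers import SentenceTransformer, models

# Wrap a plain Hugging Face checkpoint as a word-embedding module
word_embedding_model = models.Transformer('gpt2', max_seq_length=256)

# GPT-2's tokenizer has no padding token; reusing the EOS token as the pad token
# is a common workaround (an assumption here, needed for batched encoding)
if word_embedding_model.tokenizer.pad_token is None:
    word_embedding_model.tokenizer.pad_token = word_embedding_model.tokenizer.eos_token

# Add a mean-pooling layer on top of the token embeddings
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

embeddings = model.encode(['This is a test sentence.', 'And another one.'])
print(embeddings.shape)  # (2, 768) for the base 'gpt2' checkpoint

Such a model would still need fine-tuning (e.g. on NLI or STS data) before its embeddings are useful for similarity tasks.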