SentenceTransformer API vs. Transformer API + pooling

See original GitHub issue

In your documentation you mention two approaches to using your package to create sentence embeddings.

First, from the Quickstart, you wrote:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('distilbert-base-nli-stsb-mean-tokens')

# Our sentences we'd like to encode
sentences = ['This framework generates embeddings for each input sentence',
    'Sentences are passed as a list of string.',
    'The quick brown fox jumps over the lazy dog.']

# Sentences are encoded by calling model.encode()
sentence_embeddings = model.encode(sentences)
print(sentence_embeddings.shape)
# (3, 768)

Second, from Sentence Embeddings with Transformers, you wrote:

from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/bert-base-nli-mean-tokens")
model = AutoModel.from_pretrained("sentence-transformers/bert-base-nli-mean-tokens")
# Model is of type: transformers.modeling_bert.BertModel

# Tokenize the same sentences as above
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling. In this case, mean pooling
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
print(sentence_embeddings.shape)
# torch.Size([3, 768])
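
The mean_pooling helper used above is not defined in the quoted snippet. Here is a minimal sketch of what it does, along the lines of the pooling function shown on the sentence-transformers model cards (reconstructed for illustration, not a verbatim copy):

import torch

# Average the token embeddings, using the attention mask to ignore padding tokens
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # shape: (batch, seq_len, hidden)
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, dim=1) / torch.clamp(mask.sum(dim=1), min=1e-9)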

What are the important differences between these two approaches? The only difference I can see is that in the second approach, the BertModel returns token embeddings and you perform the pooling (mean or max) yourself. If I use this second approach, what would I be missing compared to using SentenceTransformer.encode()?
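
One way to see how far apart the two approaches actually are is to encode the same sentences through both paths and compare the resulting vectors. A rough sketch, under the assumption that both snippets were run with the same checkpoint (e.g. "sentence-transformers/bert-base-nli-mean-tokens" in both cases) and that model_st is a hypothetical name for the SentenceTransformer instance:

import numpy as np

# Path 1: SentenceTransformer.encode() (model_st assumed to be loaded with the same checkpoint)
emb_a = model_st.encode(sentences)

# Path 2: manual Transformer forward pass + mean pooling, from the second snippet
emb_b = sentence_embeddings.cpu().numpy()

# Per-sentence cosine similarity; values close to 1.0 mean the two paths agree
cos = (emb_a * emb_b).sum(axis=1) / (
    np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1))
print(cos)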

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
nreimers commented, Sep 21, 2020

@githubrandomuser2017 When SBERT was created, GPT2 was not available.

I never tested GPT2, but I think masked language modeling, as used in BERT, is a better pre-training task for learning sentence embeddings than the causal language modeling used by GPT2.

That said, it should be easy to fine-tune and test GPT2 with sentence-transformers.
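
For anyone who wants to try that, the sentence-transformers modules API can wrap an arbitrary Hugging Face checkpoint with a pooling layer. A rough, untested sketch (the pad-token line is an assumption, since GPT2 ships without a padding token, and the resulting model would still need fine-tuning to produce useful embeddings):

from sentence_transformers import SentenceTransformer, models

# Use GPT2 as the word-embedding model
word_embedding_model = models.Transformer('gpt2', max_seq_length=128)

# GPT2 has no padding token by default; reusing the EOS token is an assumption that may need adjustment
word_embedding_model.tokenizer.pad_token = word_embedding_model.tokenizer.eos_token

# Mean-pool token embeddings into a fixed-size sentence vector
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
print(model.encode(['A test sentence.']).shape)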

0 reactions
githubrandomuser2017 commented, Sep 19, 2020

@nreimers Why don’t you use GPT2 as the basis of a Sentence Transformer model?

Read more comments on GitHub.

