SentenceTransformer API vs. Transformer API + pooling
See original GitHub issue.

In your documentation you mention two approaches to using your package to create sentence embeddings.
First, from the Quickstart, you wrote:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('distilbert-base-nli-stsb-mean-tokens')

# Our sentences we would like to encode
sentences = ['This framework generates embeddings for each input sentence',
             'Sentences are passed as a list of strings.',
             'The quick brown fox jumps over the lazy dog.']

# Sentences are encoded by calling model.encode()
sentence_embeddings = model.encode(sentences)

print(sentence_embeddings.shape)
# (3, 768)
Second, from Sentence Embeddings with Transformers, you wrote:
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/bert-base-nli-mean-tokens")
model = AutoModel.from_pretrained("sentence-transformers/bert-base-nli-mean-tokens")
# Model is of type: transformers.modeling_bert.BertModel

# Tokenize the sentences from the first example
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling. In this case, mean pooling
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

print(sentence_embeddings.shape)
# torch.Size([3, 768])
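The second snippet calls a mean_pooling helper that is not shown here. For reference, a minimal version along the lines of the helper in the linked documentation (a sketch; define it before running the snippet above) could look like this:

import torch

def mean_pooling(model_output, attention_mask):
    # First element of the model output holds the per-token embeddings
    token_embeddings = model_output[0]
    # Expand the attention mask so padded tokens do not contribute to the average
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, dim=1) / torch.clamp(mask.sum(dim=1), min=1e-9)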
What are the important differences between these two approaches? The only difference I can see is that in the second approach, the BertModel returns token embeddings and you perform the pooling (mean or max) manually. If I use this second approach, what would I be missing compared to SentenceTransformer.encode()?
@nreimers Why don’t you use GPT2 as the basis of a Sentence Transformer model?

@githubrandomuser2017 When SBERT was created, GPT2 was not available. I never tested GPT2, but I think the masked language modeling pre-training used by BERT is a better pre-training task for sentence embeddings than the causal language modeling used by GPT2. That said, it would be easy to fine-tune and test GPT2 with sentence-transformers.
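To illustrate that last point, wrapping GPT-2 in a SentenceTransformer by combining a Transformer module with a mean-pooling module might look roughly like the sketch below. The 'gpt2' checkpoint, the max_seq_length value, and the pad-token workaround are assumptions for illustration, not something verified in this thread:

from sentence_transformers import SentenceTransformer, models

# Wrap a plain Hugging Face checkpoint as a word-embedding module
word_embedding_model = models.Transformer('gpt2', max_seq_length=256)

# GPT-2's tokenizer has no padding token; reusing the EOS token as the pad token
# is a common workaround (an assumption here, needed for batched encoding)
if word_embedding_model.tokenizer.pad_token is None:
    word_embedding_model.tokenizer.pad_token = word_embedding_model.tokenizer.eos_token

# Add a mean-pooling layer on top of the token embeddings
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

embeddings = model.encode(['This is a test sentence.', 'And another one.'])
print(embeddings.shape)  # (2, 768) for the base 'gpt2' checkpoint

Such a model would still need fine-tuning (e.g. on NLI or STS data) before its embeddings are useful for similarity tasks.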