
Pooling Strategy Question

See original GitHub issue

In the original S-BERT paper, you mention:

“Researchers have started to input individual sentences into BERT and to derive fixed size sentence embeddings. The most commonly used approach is to average the BERT output layer (known as BERT embeddings) or by using the output of the first token (the [CLS] token). As we will show, this common practice yields rather bad sentence embeddings, often worse than averaging GloVe embeddings (Pennington et al., 2014).”

When you say "average the BERT output layer", do you mean mean pooling over the last layer's hidden states? If so, how is that different from what Sentence Transformers does? Doesn't Sentence Transformers by default apply average (mean) pooling to the tokens of the last layer's hidden state (and optionally support max pooling and CLS pooling, if I am not wrong)?

So I am confused when you say that, before Sentence Transformers, researchers commonly obtained fixed-size sentence embeddings by averaging the BERT output layer (known as BERT embeddings).

Also, the BERT author hints (in the screenshot below) that average pooling of word (or token) embeddings may not yield a good sentence embedding. All along I was under the impression that this is exactly what Sentence Transformers' mean pooling does to get sentence embeddings.

[Screenshot: the BERT author's comment on averaging token embeddings]

Could you please clarify?
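For concreteness, here is a minimal sketch of the masked mean pooling the question refers to; the checkpoint name and sentences are illustrative choices, and the computation mirrors averaging the last layer's hidden states while ignoring padding tokens:

```python
# Sketch: mean pooling over BERT's last hidden state (illustrative checkpoint/sentences).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["The cat sat on the mat.", "A feline rested on the rug."]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    last_hidden = model(**encoded).last_hidden_state      # (batch, seq_len, hidden)

# Zero out padding positions, then average the remaining token vectors.
mask = encoded["attention_mask"].unsqueeze(-1).float()    # (batch, seq_len, 1)
sentence_embeddings = (last_hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embeddings.shape)                          # torch.Size([2, 768])
```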

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
nreimers commented, Jul 12, 2022

Not sure if I get the question. The output of BERT is averaged. But BERT needs fine-tuning on suitable data to produce meaningful text embeddings.
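As a rough illustration of the distinction in this reply, the sketch below encodes the same sentence pair once with plain bert-base-uncased plus mean pooling and once with a checkpoint fine-tuned for sentence embeddings (all-MiniLM-L6-v2 here, chosen only as an example); the fine-tuned model should typically give a more meaningful similarity score:

```python
# Sketch: mean-pooled vanilla BERT vs. a fine-tuned sentence-embedding checkpoint.
from sentence_transformers import SentenceTransformer, models, util

sentences = ["A man is eating food.", "Someone is having a meal."]

# 1) Vanilla BERT + mean pooling (no sentence-level fine-tuning).
word_emb = models.Transformer("bert-base-uncased")
pooling = models.Pooling(word_emb.get_word_embedding_dimension(), pooling_mode="mean")
plain_bert = SentenceTransformer(modules=[word_emb, pooling])

# 2) A checkpoint fine-tuned on sentence-pair data (illustrative choice).
sbert = SentenceTransformer("all-MiniLM-L6-v2")

for name, model in [("plain BERT + mean pooling", plain_bert), ("fine-tuned SBERT", sbert)]:
    emb = model.encode(sentences, convert_to_tensor=True)
    print(name, float(util.cos_sim(emb[0], emb[1])))
```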

0 reactions
PrithivirajDamodaran commented, Jul 12, 2022

Understood, thanks

