Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

conceptual question about the embeddings

See original GitHub issue

Hello there! and thanks for this package. It is really super fast and efficient.

I just have a conceptual question about the models that are available in sentence-transformers. Are they trained for a particular task, or do they come from large unsupervised masked language models (say, BERT) trained on massive amounts of text?

For instance, if I were to train a BERT masked language model from scratch using Hugging Face, I would also obtain embeddings at the sentence level. Would these embeddings be similar to the ones available in sentence-transformers? The reason I am asking is that I want to train a masked language model on my highly specialized corpus of text.

Thanks!
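To make the distinction in the question concrete: a plain masked-language model gives you token-level hidden states, and a sentence embedding is typically derived from them by pooling (sentence-transformers models additionally fine-tune on sentence-similarity data, which is what makes their embeddings behave well for comparison). A rough sketch of mean pooling, with random arrays standing in for real model hidden states (nothing here is code from the issue):

```python
# Conceptual sketch: deriving one sentence embedding from token-level
# hidden states via mean pooling, ignoring padded positions.
# Random arrays stand in for the hidden states a model like BERT
# would produce (real BERT uses 768 dimensions, not 8).
import numpy as np

rng = np.random.default_rng(0)

# Batch of 1 sentence, 6 token positions, 8 hidden dimensions;
# the last two positions are padding.
hidden_states = rng.normal(size=(1, 6, 8))
attention_mask = np.array([[1, 1, 1, 1, 0, 0]])

# Mean pooling: average the token vectors, skipping padding.
mask = attention_mask[..., None]            # shape (1, 6, 1)
summed = (hidden_states * mask).sum(axis=1) # sum of real token vectors
counts = mask.sum(axis=1)                   # number of real tokens
sentence_embedding = summed / counts        # shape (1, 8)

print(sentence_embedding.shape)
```

This is the pooling step only; the point of sentence-transformers is that pooled embeddings from a model fine-tuned on similarity data are far more useful for semantic comparison than the same pooling applied to a raw MLM.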

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

2 reactions
nreimers commented, Jan 5, 2022

Yes

1 reaction
nreimers commented, Jan 5, 2022

Only one

Read more comments on GitHub >

Top Results From Across the Web

conceptual question about triplet loss embeddings
It is my understanding that the input to the CNN is a triplet (hence the name) of three pictures or thumbnails - one...
Read more >
Conceptual questions about transformers
transformers models create contextual embeddings. That is, the embedding of a word depends on the words around it in the sentence. Does...
Read more >
Learning Answer Embeddings for Visual Question Answering
Conceptual diagram of our approach. We learn two embedding functions to transform image question pair (i, q) and (possible) answer a...
Read more >
Learning Conceptual-Contextual Embeddings for Medical Text
tion model called Conceptual-Contextual (CC) embeddings, ... sentation into a specific task, like question answering (Hao.
Read more >
Learning Conceptual Embeddings for Words using Context
This dataset consists of 8,869 semantic and 10,675 syntactic queries. Each query is a tuple of four words (A, B, C, D)...
Read more >
