
Is it possible to train CosineSimilarityLoss without labels?

See original GitHub issue

I’d like to rebuild a model from a certain paper that minimizes 1 - CosineSimilarity(a, b), but as far as I can tell, the implementation of CosineSimilarityLoss in this repo and its usage in the examples always require a label (i.e. -1 or 1). Is it possible to train with CosineSimilarityLoss without labels, providing only positive/similar example pairs?

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

2 reactions
nreimers commented, Jul 27, 2020

Hi @olastor, the CosineSimilarityLoss requires a continuous label, e.g. from -1 to 1 or from 0 to 1.
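
To illustrate what a continuous label means here: conceptually, CosineSimilarityLoss computes the cosine similarity between the two sentence embeddings and compares it to the gold score with a squared error. Below is a minimal pure-Python sketch of that idea on single vectors; it is not the library’s actual implementation, which works on batched tensors:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(u, v, label):
    """Squared error between the cosine similarity and a continuous gold
    label (a float, e.g. in [0, 1] or [-1, 1] -- not a class index)."""
    return (cosine_similarity(u, v) - label) ** 2

# Identical vectors with gold score 1.0 -> zero loss.
print(cosine_similarity_loss([1.0, 0.0], [1.0, 0.0], 1.0))  # -> 0.0
# Orthogonal vectors labelled as similar -> large loss.
print(cosine_similarity_loss([1.0, 0.0], [0.0, 1.0], 1.0))  # -> 1.0
```

This is why a plain -1/1 class label is not required: any float in the target range works as a training signal.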

If you have only positive pairs, you can construct negative pairs by randomly combining sentences from two different pairs. This usually gives you a negative pair. Then you can train with ContrastiveLoss.
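
The re-pairing trick above can be sketched as follows. `make_negative_pairs` is a hypothetical helper, and it assumes the corpus is diverse enough that a random re-pairing is almost always dissimilar (a heuristic, not a guarantee):

```python
import random

def make_negative_pairs(positive_pairs, seed=0):
    """Build (likely) negative pairs by re-pairing the first sentence of
    one positive pair with the second sentence of a different pair."""
    rng = random.Random(seed)
    negatives = []
    for i, (first, _) in enumerate(positive_pairs):
        j = rng.randrange(len(positive_pairs))
        while j == i:  # never re-pair a sentence with its own partner
            j = rng.randrange(len(positive_pairs))
        negatives.append((first, positive_pairs[j][1]))
    return negatives

positives = [
    ("a cat sits", "a feline rests"),
    ("stock prices fell", "the market dropped"),
    ("it rained today", "wet weather all day"),
]
negatives = make_negative_pairs(positives)
print(negatives)  # three re-paired, presumably dissimilar sentence pairs
```

For ContrastiveLoss you would then label the original pairs as similar (1) and the constructed pairs as dissimilar (0).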

Else have a look at triplet loss.
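
Triplet loss, for reference, pulls an anchor towards a positive example and pushes it away from a negative one by at least a margin. A minimal sketch using Euclidean distance:

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Anchor already much closer to the positive -> zero loss.
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 0.0]))  # -> 0.0
# Anchor equally close to both -> loss equals the margin.
print(triplet_loss([0.0, 0.0], [1.0, 0.0], [1.0, 0.0]))  # -> 1.0
```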

Best,
Nils Reimers

0 reactions
datistiquo commented, Nov 4, 2020

I think if you have labels 0 and 1, then cosine similarity should also work.

But this is exactly what you also do in the paper, right? So I am a bit unsure what your “should work” means. Maybe you use the gold scores as “labels”?

From the paper:

Regression Objective Function. The cosine-similarity between the two sentence embeddings u and v is computed (Figure 2). We use mean-squared-error loss as the objective function.

Read more comments on GitHub >

Top Results From Across the Web

Losses — Sentence-Transformers documentation
BatchAllTripletLoss takes a batch with (label, sentence) pairs and computes the loss for all possible, valid triplets, i.e., anchor and positive must have …
Read more >
Train and Fine-Tune Sentence Transformers Models
Note that Sentence Transformers models can be trained with human labeling (cases 1 and 3) or with labels automatically deduced from text ...
Read more >
How to apply cosine similarity loss function in unsupervised ...
The loss requires y_true and y_pred , but as can be seen, this is unsupervised training and there is no y_true .
Read more >
No means 'No': a non-improper modeling approach, with ...
We used the architecture of sentence-BERT implemented by Reimers and Gurevych (2019) which used cosine similarity loss for training.
Read more >
