Question about training with BatchHardTripletLoss
Maybe this is a naive question (I am not a native PyTorch user).
When training as in the example shown here (for the loss mentioned above):
```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, SentencesDataset, InputExample, losses

train_batch_size = 16  # was undefined in the snippet

model = SentenceTransformer('distilbert-base-nli-mean-tokens')
train_examples = [InputExample(texts=['Sentence from class 0'], label=0), InputExample(texts=['Another sentence from class 0'], label=0),
                  InputExample(texts=['Sentence from class 1'], label=1), InputExample(texts=['Sentence from class 2'], label=2)]
train_dataset = SentencesDataset(train_examples, model)
train_dataloader = DataLoader(train_dataset, shuffle=True, batch_size=train_batch_size)
train_loss = losses.BatchSemiHardTripletLoss(model=model)
```
how is a siamese model trained here, where I have two inputs? You are using a SentenceTransformer (which maps a single input to an output). Also, in your bi-encoder example you build a SentenceTransformer from scratch. I just wonder how training in a siamese manner happens?
In my understanding, SentenceTransformer is a siamese bi-encoder (like in your paper).
Similarly, in your Quora example: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/quora_duplicate_questions/training_multi-task-learning.py
a SentenceTransformer model is also trained and receives two inputs for sentence pairs. I wonder where and when the model “knows” how to fit depending on the number of inputs? I feel I am missing something. When is a “siamese” model trained, and when is a “single” model with one input trained?
@datistiquo All the losses use a single network / model: the inputs are passed through the same network (the same object) in all cases.
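To make the weight sharing concrete, here is a rough sketch (not the library's actual code; the class name is made up and details are simplified) of how a pair loss such as CosineSimilarityLoss is structured:

```python
import torch

class PairLossSketch(torch.nn.Module):  # simplified illustration, not the real class
    def __init__(self, model):
        super().__init__()
        self.model = model  # one SentenceTransformer instance, i.e. one set of weights

    def forward(self, sentence_features, labels):
        # Every input sentence is encoded by the *same* model object,
        # so the weights are shared -- that is all "siamese" means here.
        reps = [self.model(features)['sentence_embedding'] for features in sentence_features]
        scores = torch.cosine_similarity(reps[0], reps[1])
        return torch.nn.functional.mse_loss(scores, labels.float())
```

Whether the loss receives one, two, or three texts per example, they all go through `self.model`; there is never a second copy of the network.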
The only difference is how the losses are computed. Which loss to compute depends on the training data you have available and its properties. So based on what labeled data you have, you choose the right loss.
BatchHard generates the triplets online (as described in the above blog post). So there is no need to generate triplets yourself: the loss looks into the batch and forms the triplets from it on the fly (for each anchor, BatchHard selects the hardest positive and hardest negative in the batch).
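For illustration, here is a simplified sketch of the batch-hard idea (not the library's implementation; the margin value is an assumption):

```python
import torch

def batch_hard_triplet_loss(embeddings, labels, margin=5.0):
    # All pairwise distances within the batch.
    dist = torch.cdist(embeddings, embeddings)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-class mask
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    # Hardest positive per anchor: the farthest example with the same label.
    hardest_pos = dist.masked_fill(~same | self_mask, 0).max(dim=1).values
    # Hardest negative per anchor: the closest example with a different label.
    hardest_neg = dist.masked_fill(same, float('inf')).min(dim=1).values
    return torch.relu(hardest_pos - hardest_neg + margin).mean()
```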
For evaluation, however, we want to see how well the model works on specific triplets. So we create some fixed triplets and evaluate the model on them.
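For example, with the library's TripletEvaluator (the sentences below are placeholders):

```python
from sentence_transformers.evaluation import TripletEvaluator

# Fixed evaluation triplets (placeholder sentences).
anchors   = ['A sentence from class 0']
positives = ['Another sentence from class 0']
negatives = ['A sentence from class 1']
evaluator = TripletEvaluator(anchors, positives, negatives, name='dev')
# Returns the fraction of triplets where the anchor is closer
# to the positive than to the negative.
evaluator(model)
```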
https://stackoverflow.com/questions/11218477/how-can-i-use-pickle-to-save-a-dict
This also works with any other Python data type.
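A minimal example (the dict contents here are just placeholders):

```python
import pickle

data = {'some sentence': [0.1, 0.2, 0.3]}  # e.g. a dict of sentence -> embedding

with open('data.pkl', 'wb') as f:
    pickle.dump(data, f)

with open('data.pkl', 'rb') as f:
    restored = pickle.load(f)
```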