
Fine-tuning tips with loss functions & evaluators

See original GitHub issue

Hi,

Before my question, I’d like to thank you for open-sourcing your awesome work to the community.

Context: I’m continuing the training of the SentenceTransformer ('bert-base-nli-mean-tokens') model on my own data, following the sample training_stsbenchmark_continue_training.py you provided. From my own data I constructed an NLI-style training set of the form {(s1, s2), label}, where the labels are mapped to {"entailment": 1, "neutral": 2}; I do not have the {"contradiction": 0} case.

I looked at the tutorial script for continued training. In that example you use the STS data, whose labels differ from the NLI classification labels. The loss is CosineSimilarityLoss and the evaluator is EmbeddingSimilarityEvaluator.

Question: Is it possible to continue training the 'bert-base-nli-mean-tokens' model with NLI-style training data? If so, for the classification task, which loss function and evaluator would you recommend? My training data has about 500,000 instances; for continued training, how many epochs would be a good choice?

Thank you in advance.

Best, Hetian

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

2 reactions
nreimers commented, Apr 30, 2020

You have to pass the SoftmaxLoss model to the evaluator:

evaluator = LabelAccuracyEvaluator(dev_dataloader, softmax_model=train_loss)

This should work.
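
For reference, here is a minimal sketch of that wiring, assuming a recent sentence-transformers version; the dev sentences are placeholders and the dataset/loader helpers may differ slightly between releases:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, SentencesDataset, InputExample, losses
from sentence_transformers.evaluation import LabelAccuracyEvaluator

model = SentenceTransformer('bert-base-nli-mean-tokens')

# Hypothetical two-label dev examples; the sentences below are placeholders.
dev_examples = [
    InputExample(texts=['A man is eating.', 'A person eats food.'], label=1),
    InputExample(texts=['A man is eating.', 'The weather is cold today.'], label=0),
]
dev_dataloader = DataLoader(SentencesDataset(dev_examples, model=model), batch_size=16)

# The SoftmaxLoss used for training carries the classification head;
# passing it as softmax_model lets the evaluator reuse that head.
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=2,
)
evaluator = LabelAccuracyEvaluator(dev_dataloader, softmax_model=train_loss)
```

Note that the label mapping from the question, {"entailment": 1, "neutral": 2}, would need to be shifted to a 0-indexed mapping such as {"neutral": 0, "entailment": 1}, since SoftmaxLoss uses CrossEntropyLoss internally, which expects labels in the range [0, num_labels).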

1 reaction
nreimers commented, Apr 29, 2020

Hi Hetian, for NLI-style training data have a look at the training_nli.py example. There, you just need to change how the model is constructed. There are two ways you can build a sentence embedding model.

Option 1: Take the individual building blocks and stick them together, i.e. you start with a BERT / Transformer model and then add a Pooling layer. This is done in training_nli.py.

Option 2: You take an already built sentence transformer model. This model is loaded via SentenceTransformer('bert-base-nli-mean-tokens'). In the background, it downloads the fine-tuned BERT model and the config for the pooling layer and loads them as in Option 1.
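
As a sketch of the two options, following the pattern used in training_nli.py (the base model name, sequence length, and pooling arguments here are illustrative and may vary by library version):

```python
from sentence_transformers import SentenceTransformer, models

# Option 1: assemble the model yourself, as in training_nli.py:
# a Transformer encoder followed by a mean-pooling layer.
word_embedding_model = models.Transformer('bert-base-uncased', max_seq_length=128)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)
model_option_1 = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Option 2: load an already built sentence embedding model; the fine-tuned
# BERT weights and the pooling config are downloaded and assembled for you.
model_option_2 = SentenceTransformer('bert-base-nli-mean-tokens')
```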

To continue training, you just have to change training_nli.py so that, instead of creating your model from scratch from BERT, you load the model with: model = SentenceTransformer('bert-base-nli-mean-tokens')
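
Putting the pieces together, a sketch of continued training on two-label NLI-style data could look like this (the example sentences, batch size, and epoch count are placeholders, not recommendations from this thread):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, SentencesDataset, InputExample, losses
from sentence_transformers.evaluation import LabelAccuracyEvaluator

# Load the already fine-tuned model instead of building one from scratch.
model = SentenceTransformer('bert-base-nli-mean-tokens')

# {(s1, s2), label} pairs, remapped to 0-indexed labels for SoftmaxLoss,
# e.g. {"neutral": 0, "entailment": 1}. These two examples are placeholders.
train_examples = [
    InputExample(texts=['A man is eating.', 'A person eats food.'], label=1),
    InputExample(texts=['A man is eating.', 'The weather is cold today.'], label=0),
]
train_dataloader = DataLoader(SentencesDataset(train_examples, model=model),
                              shuffle=True, batch_size=16)

train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=2,
)

# Use a held-out split for evaluation in practice; the training examples are
# reused here only to keep the sketch short.
dev_dataloader = DataLoader(SentencesDataset(train_examples, model=model), batch_size=16)
evaluator = LabelAccuracyEvaluator(dev_dataloader, softmax_model=train_loss)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=evaluator,
    epochs=1,  # placeholder; tune against the evaluator on a held-out set
    warmup_steps=100,
    output_path='output/continue-training-nli',
)
```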

