
Fine-tuning using ALBERT

See original GitHub issue

I have gone through older issues, and @nreimers has pointed out many times that the ALBERT model does not perform particularly well with sentence-transformers. I am absolutely fine with ~5-10 points lower performance than BERT, but after training ALBERT for 1 epoch on the AllNLI dataset I got awful results.

ALBERT-large-V1

2020-06-08 18:20:28 - Cosine-Similarity :       Pearson: 0.1973 Spearman: 0.2404
2020-06-08 18:20:28 - Manhattan-Distance:       Pearson: 0.2318 Spearman: 0.2411
2020-06-08 18:20:28 - Euclidean-Distance:       Pearson: 0.2313 Spearman: 0.2408
2020-06-08 18:20:28 - Dot-Product-Similarity:   Pearson: 0.1437 Spearman: 0.1551

ALBERT-large-V2

2020-06-09 03:58:27 - Cosine-Similarity :       Pearson: 0.0722 Spearman: 0.0633
2020-06-09 03:58:27 - Manhattan-Distance:       Pearson: 0.1236 Spearman: 0.1089
2020-06-09 03:58:27 - Euclidean-Distance:       Pearson: 0.1237 Spearman: 0.1090
2020-06-09 03:58:27 - Dot-Product-Similarity:   Pearson: 0.1047 Spearman: 0.0900
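(For context, these four metric rows are what sentence-transformers' EmbeddingSimilarityEvaluator logs on the STS benchmark dev set. A minimal sketch of how such numbers are produced, assuming the sentence-transformers API of that era; the dev_samples list is a one-pair placeholder, not real data:

    from sentence_transformers import SentenceTransformer, models, InputExample
    from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

    # Build an ALBERT-based sentence encoder with mean pooling,
    # mirroring what the default training script constructs.
    word_emb = models.Transformer('albert-large-v1')
    pooling = models.Pooling(word_emb.get_word_embedding_dimension())
    model = SentenceTransformer(modules=[word_emb, pooling])

    # dev_samples is an assumed placeholder: pairs with gold scores in [0, 1].
    dev_samples = [InputExample(texts=['A man is eating.', 'A man eats.'], label=0.9)]
    evaluator = EmbeddingSimilarityEvaluator.from_input_examples(dev_samples, name='sts-dev')
    evaluator(model, output_path='.')  # logs the Cosine/Manhattan/Euclidean/Dot rows
)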

I am using all the default parameters from the training script:

    python /content/sentence-transformers/examples/training_transformers/training_nli.py 'albert-large-v1'
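(For reference, a condensed sketch of what that NLI training script does: a Transformer + mean-pooling encoder trained with a softmax classification head over the three NLI labels. Dataset loading is simplified to a one-example placeholder; treat this as an approximation of the example script, not a verbatim copy:

    from torch.utils.data import DataLoader
    from sentence_transformers import SentenceTransformer, models, losses, InputExample

    word_emb = models.Transformer('albert-large-v1')
    pooling = models.Pooling(word_emb.get_word_embedding_dimension())
    model = SentenceTransformer(modules=[word_emb, pooling])

    # nli_samples is an assumed placeholder: AllNLI premise/hypothesis pairs,
    # labels 0=contradiction, 1=entailment, 2=neutral.
    nli_samples = [InputExample(texts=['A soccer game.', 'Someone plays a sport.'], label=1)]
    train_dataloader = DataLoader(nli_samples, shuffle=True, batch_size=16)
    train_loss = losses.SoftmaxLoss(
        model=model,
        sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
        num_labels=3)

    model.fit(train_objectives=[(train_dataloader, train_loss)],
              epochs=1,
              warmup_steps=100,
              output_path='output/albert-large-nli')
)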

I checked the similarity_evaluation_results file after fine-tuning. For ALBERT-large-V2, all cosine_pearson values are NaN; for ALBERT-large-V1, after an initial rise to 0.24, the value stagnates.
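(That file is the CSV the evaluator appends to during training, so it is a quick way to see where the scores go flat or NaN. A short inspection sketch; the exact file name and output directory below are assumptions based on the evaluator's default naming:

    import pandas as pd

    df = pd.read_csv('output/similarity_evaluation_sts-dev_results.csv')
    print(df[['epoch', 'steps', 'cosine_pearson', 'cosine_spearman']])
    print('NaN evaluations:', df['cosine_pearson'].isna().sum())
)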

It takes ~8 hrs on Google Colab to fine-tune ALBERT on the AllNLI dataset. Any pointers for getting at least respectable results? Am I doing anything wrong here?

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
knok commented, Feb 3, 2021

Just FYI: [2101.10642v1] Evaluation of BERT and ALBERT Sentence Embedding Performance on Downstream NLP Tasks. According to the paper, a CNN-based structure instead of average pooling gives better performance with ALBERT.
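(sentence-transformers ships a models.CNN module, so a CNN-over-token-embeddings variant of the paper's idea can be sketched as below. The kernel sizes and channel count are the module's illustrative defaults, not values taken from the paper:

    from sentence_transformers import SentenceTransformer, models

    word_emb = models.Transformer('albert-large-v1')
    # Convolve over the token embeddings before pooling.
    cnn = models.CNN(in_word_embedding_dimension=word_emb.get_word_embedding_dimension(),
                     out_channels=256,
                     kernel_sizes=[1, 3, 5])
    pooling = models.Pooling(cnn.get_word_embedding_dimension())
    model = SentenceTransformer(modules=[word_emb, cnn, pooling])
)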

0 reactions
Akshayextreme commented, Jun 9, 2020

ALBERT-base-V2, fine-tuned on STSb for 4 epochs:

2020-06-09 15:15:07 - Cosine-Similarity :       Pearson: 0.7880 Spearman: 0.7861
2020-06-09 15:15:07 - Manhattan-Distance:       Pearson: 0.7558 Spearman: 0.7592
2020-06-09 15:15:07 - Euclidean-Distance:       Pearson: 0.7634 Spearman: 0.7657
2020-06-09 15:15:07 - Dot-Product-Similarity:   Pearson: 0.7393 Spearman: 0.7338
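(For comparison, a condensed sketch of STSb fine-tuning with a cosine-similarity regression objective, the approach used by the sentence-transformers STS example script. Dataset loading is reduced to a one-pair placeholder; sts_samples is assumed:

    from torch.utils.data import DataLoader
    from sentence_transformers import SentenceTransformer, models, losses, InputExample

    word_emb = models.Transformer('albert-base-v2')
    pooling = models.Pooling(word_emb.get_word_embedding_dimension())
    model = SentenceTransformer(modules=[word_emb, pooling])

    # sts_samples is an assumed placeholder: sentence pairs with gold
    # similarity scores rescaled from [0, 5] to [0, 1].
    sts_samples = [InputExample(texts=['A plane is taking off.',
                                       'An air plane is taking off.'], label=5.0 / 5.0)]
    train_dataloader = DataLoader(sts_samples, shuffle=True, batch_size=16)
    train_loss = losses.CosineSimilarityLoss(model=model)

    model.fit(train_objectives=[(train_dataloader, train_loss)],
              epochs=4,
              warmup_steps=100)
)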

Read more comments on GitHub >

Top Results From Across the Web

Fine-tune Albert with Pre-training on Custom Corpus
This post illustrates the simple steps to pre-train the state-of-the-art ALBERT [1] NLP model on a custom corpus and further fine-tune the …

albert/README.md at master · google-research/albert - GitHub
To fine-tune and evaluate a pretrained ALBERT on GLUE, please see the convenience script run_glue.sh. Lower-level use cases may want to use …

ArBert/albert-base-v2-finetuned-ner - Hugging Face
This model is a fine-tuned version of albert-base-v2 on the conll2003 dataset. It achieves the following results on the evaluation set: …

Fine-tune ALBERT for sentence-pair classification
You should be able to reach 91.19 F1 score on the validation set (the score reported in the ALBERT paper is 90.9) …

ALBERT-based fine-tuning model for cyberbullying analysis
This coupled with the fact that ALBERT is pre-trained on a large corpus allowing the flexibility to use a smaller dataset for fine-tuning …
