Optimal hyperparameters for fine-tuning on small datasets?
I'm trying to improve the accuracy of an ASR model for a specific domain.
The base model is a QuartzNet model fine-tuned from English to my language, and now I'm trying to improve it further on a domain-specific dataset.
The dataset has around 12h of audio, and I used speed perturbation to expand it to around 37h.
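(Going from 12h to roughly 37h is consistent with the usual 3-fold speed perturbation at rates 0.9/1.0/1.1. Below is a minimal offline sketch of that recipe using torchaudio's sox effects; the filenames are placeholders, and this is not the exact script from the issue.)

import torchaudio

waveform, sr = torchaudio.load("utt001.wav")  # placeholder input file
for rate in (0.9, 1.1):  # 1.0 is the original recording, so two extra copies
    perturbed, _ = torchaudio.sox_effects.apply_effects_tensor(
        waveform, sr,
        effects=[["speed", str(rate)], ["rate", str(sr)]],
    )
    torchaudio.save(f"utt001_sp{rate}.wav", perturbed, sr)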
I use the following hyperparameters, but the loss seems to plateau after a significant number of epochs.
new_optimizer = {
    'name': 'novograd',        # NovoGrad, the optimizer QuartzNet was trained with
    'lr': 0.0001,
    'betas': (0.95, 0.25),
    'weight_decay': 0.001,
    'sched': {
        'last_epoch': -1,
        'min_lr': 0.0,
        'monitor': 'val_loss',
        'name': 'CosineAnnealing',
        'reduce_on_plateau': False,
        'warmup_ratio': 0.12,  # warm up the LR over the first 12% of training steps
        'warmup_steps': None,
    },
}
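For reference, a config like this is normally attached to the model through NeMo's ModelPT.setup_optimization API. The sketch below assumes an EncDecCTCModel and a placeholder checkpoint path; it is not from the original issue.

import nemo.collections.asr as nemo_asr

# Placeholder path; substitute the actual fine-tuned .nemo checkpoint.
model = nemo_asr.models.EncDecCTCModel.restore_from("base_model.nemo")

# setup_optimization accepts a plain dict; betas is converted to a list
# because OmegaConf containers do not store tuples.
cfg = dict(new_optimizer)
cfg['betas'] = list(cfg['betas'])
model.setup_optimization(optim_config=cfg)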
I train for 500 epochs.
Do you have any advice on how I can get the best accuracy out of such a small dataset, or what minimum number of hours tends to yield a good improvement?
If your dataset is that small, it's best to freeze the encoder and then unfreeze just the batch norm layers, irrespective of whether the language is the same or different.
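That freeze-then-unfreeze-batch-norm step can be done along the following lines (a sketch, not from this issue; it assumes a NeMo EncDecCTCModel named model whose QuartzNet encoder uses BatchNorm1d layers):

import torch

model.encoder.freeze()  # sets requires_grad=False and eval mode for the whole encoder
for module in model.encoder.modules():
    if isinstance(module, torch.nn.BatchNorm1d):
        module.train()  # keep running statistics updating on the new domain
        for param in module.parameters():
            param.requires_grad = True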
Another short question, @titu1994: does batch normalization still have to be unfrozen if the language remains the same and I'm only doing domain-specific fine-tuning?