Optimal hyperparameters for fine-tuning on small datasets?
I'm trying to improve the accuracy of an ASR model for a specific domain.
The base model is a QuartzNet model fine-tuned from English to my language, and now I'm trying to improve it further on a domain-specific dataset.
The dataset has around 12h of audio, and I used speed perturbation to expand it to around 37h.
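(Going from 12h to roughly 37h is consistent with the usual 3-fold speed perturbation at rates 0.9/1.0/1.1. Below is a minimal offline sketch of that recipe using torchaudio's sox effects; the filenames are placeholders, and this is not the exact script from the issue.)

import torchaudio

waveform, sr = torchaudio.load("utt001.wav")  # placeholder input file
for rate in (0.9, 1.1):  # 1.0 is the original recording, so two extra copies
    perturbed, _ = torchaudio.sox_effects.apply_effects_tensor(
        waveform, sr,
        effects=[["speed", str(rate)], ["rate", str(sr)]],
    )
    torchaudio.save(f"utt001_sp{rate}.wav", perturbed, sr)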
I use the following hyperparameters, but the loss seems to plateau after a significant number of epochs.
new_optimizer = {
    'name': 'novograd',        # NovoGrad, the optimizer QuartzNet was trained with
    'lr': 0.0001,
    'betas': (0.95, 0.25),
    'weight_decay': 0.001,
    'sched': {
        'last_epoch': -1,
        'min_lr': 0.0,
        'monitor': 'val_loss',
        'name': 'CosineAnnealing',
        'reduce_on_plateau': False,
        'warmup_ratio': 0.12,  # warm up the LR over the first 12% of training steps
        'warmup_steps': None,
    },
}
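For reference, a config like this is normally attached to the model through NeMo's ModelPT.setup_optimization API. The sketch below assumes an EncDecCTCModel and a placeholder checkpoint path; it is not from the original issue.

import nemo.collections.asr as nemo_asr

# Placeholder path; substitute the actual fine-tuned .nemo checkpoint.
model = nemo_asr.models.EncDecCTCModel.restore_from("base_model.nemo")

# setup_optimization accepts a plain dict; betas is converted to a list
# because OmegaConf containers do not store tuples.
cfg = dict(new_optimizer)
cfg['betas'] = list(cfg['betas'])
model.setup_optimization(optim_config=cfg)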
I train for 500 epochs.
Do you have any advice on how I can get the best accuracy out of such a small dataset, or what minimum number of hours tends to yield a good improvement?
If your dataset is that small, it's best to freeze the encoder and then unfreeze just the batch norm layers, irrespective of whether the language is the same or different.
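That freeze-then-unfreeze-batch-norm step can be done along the following lines (a sketch, not from this issue; it assumes a NeMo EncDecCTCModel named model whose QuartzNet encoder uses BatchNorm1d layers):

import torch

model.encoder.freeze()  # sets requires_grad=False and eval mode for the whole encoder
for module in model.encoder.modules():
    if isinstance(module, torch.nn.BatchNorm1d):
        module.train()  # keep running statistics updating on the new domain
        for param in module.parameters():
            param.requires_grad = True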
Another short question, @titu1994: does batch normalization still have to be unfrozen if the language remains the same and I'm only doing domain-specific fine-tuning?