Fine-tune stt_en_conformer_transducer_large (EncDecRNNTBPEModel)
I'm trying to fine-tune the stt_en_conformer_transducer_large model on a custom dataset of about 300 hours, and I have a problem with setting up the training configuration.
In this example you show how to train a model from scratch, but I understood from the model card (https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_conformer_transducer_small) that the model can also be fine-tuned.
I tried to load the model, extract its config, modify it, and re-assign it to the model, but it did not work.
Here is what I did:
Loading the model:
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(model_name="stt_en_conformer_transducer_large", map_location='cpu')
Loading the model configurations:
import copy
modelConfig = copy.deepcopy(asr_model.cfg)   # deep copy so edits do not mutate the live model config
modelConfig
Set the training data:
modelConfig['train_ds']['manifest_filepath'] = TRAIN_MANIFEST
modelConfig['validation_ds']['manifest_filepath'] = TEST_MANIFEST
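Depending on the dataset, a few more fields in these sections usually need adjusting as well. A sketch with illustrative values (the field names follow the standard NeMo ASR train_ds/validation_ds layout; the batch size and duration limit here are assumptions to tune for your data and GPU memory):
modelConfig['train_ds']['batch_size'] = 16        # illustrative; set to fit your GPU memory
modelConfig['train_ds']['shuffle'] = True
modelConfig['train_ds']['max_duration'] = 20.0    # illustrative; drop very long utterances
modelConfig['validation_ds']['batch_size'] = 16
modelConfig['validation_ds']['shuffle'] = False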
Assigned the training config to the model:
asr_model.setup_training_data(modelConfig['train_ds'])
Here is the error I get:
[NeMo I 2022-01-18 18:44:40 collections:173] Dataset loaded with 1503 files totalling 2.81 hours
[NeMo I 2022-01-18 18:44:40 collections:174] 0 files were filtered totalling 0.00 hours
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_3295870/3465286351.py in <module>
----> 1 asr_model.setup_training_data(modelConfig['train_ds'])
~/anaconda3/envs/nemo/lib/python3.7/site-packages/nemo/collections/asr/models/rnnt_models.py in setup_training_data(self, train_data_config)
499 # If it's an int, we assume that the user has set it to something sane, i.e. <= # training batches,
500 # and don't change it. Otherwise, adjust batches accordingly if it's a float (including 1.0).
--> 501 if isinstance(self._trainer.limit_train_batches, float):
502 self._trainer.limit_train_batches = int(
503 self._trainer.limit_train_batches
AttributeError: 'NoneType' object has no attribute 'limit_train_batches'
Thanks!
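The traceback points at the actual problem: setup_training_data() reads self._trainer.limit_train_batches, and a model restored with from_pretrained() has no Trainer attached yet, so self._trainer is None. Attaching a PyTorch Lightning Trainer before setting up the data resolves it. A minimal sketch, assuming NeMo's set_trainer() API and PyTorch Lightning 1.x style Trainer arguments (the settings are illustrative):
import pytorch_lightning as pl

trainer = pl.Trainer(gpus=1, max_epochs=100)   # illustrative; tune devices and epochs
asr_model.set_trainer(trainer)                 # attach the trainer before data setup
asr_model.setup_training_data(modelConfig['train_ds'])
asr_model.setup_validation_data(modelConfig['validation_ds'])
trainer.fit(asr_model)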
I did that test on a small dataset of 3 hours. However, my complete dataset is around 250 hours with different English accents, so I will try to re-train on the full dataset and see how it goes.
Otherwise it seems fine, though you need to train for a lot longer than 2 epochs.
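Along those lines, when fine-tuning rather than training from scratch, it is common to also lower the learning rate before calling setup_optimization(). A minimal sketch, assuming the pretrained config keeps its optimizer settings under an optim section (the value below is illustrative, not a recommendation):
from omegaconf import open_dict

with open_dict(modelConfig):
    modelConfig.optim.lr = 1e-4   # illustrative: lower than the from-scratch learning rate

asr_model.setup_optimization(modelConfig.optim)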