Fine-tune stt_en_conformer_transducer_large (EncDecRNNTBPEModel)
I'm trying to fine-tune the stt_en_conformer_transducer_large model on a custom dataset of about 300 hours, and I have a problem with setting up the training configuration.
In this example you show how to train a model from scratch, but I understood from the model card (https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_conformer_transducer_small) that the model can also be fine-tuned.
I tried to load the model, extract its config, modify it, and re-assign it to the model, but it did not work.
Here is what I did:
Loading the model:
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(model_name="stt_en_conformer_transducer_large", map_location='cpu')
Loading the model configurations:
import copy
modelConfig = copy.deepcopy(asr_model.cfg)   # deep copy so edits do not mutate the live model config
modelConfig
Set the training data:
modelConfig['train_ds']['manifest_filepath'] = TRAIN_MANIFEST
modelConfig['validation_ds']['manifest_filepath'] = TEST_MANIFEST
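Depending on the dataset, a few more fields in these sections usually need adjusting as well. A sketch with illustrative values (the field names follow the standard NeMo ASR train_ds/validation_ds layout; the batch size and duration limit here are assumptions to tune for your data and GPU memory):
modelConfig['train_ds']['batch_size'] = 16        # illustrative; set to fit your GPU memory
modelConfig['train_ds']['shuffle'] = True
modelConfig['train_ds']['max_duration'] = 20.0    # illustrative; drop very long utterances
modelConfig['validation_ds']['batch_size'] = 16
modelConfig['validation_ds']['shuffle'] = False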
Assigned the training config to the model:
asr_model.setup_training_data(modelConfig['train_ds'])
Here is the error I get:
[NeMo I 2022-01-18 18:44:40 collections:173] Dataset loaded with 1503 files totalling 2.81 hours
[NeMo I 2022-01-18 18:44:40 collections:174] 0 files were filtered totalling 0.00 hours
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_3295870/3465286351.py in <module>
----> 1 asr_model.setup_training_data(modelConfig['train_ds'])
~/anaconda3/envs/nemo/lib/python3.7/site-packages/nemo/collections/asr/models/rnnt_models.py in setup_training_data(self, train_data_config)
499 # If it's an int, we assume that the user has set it to something sane, i.e. <= # training batches,
500 # and don't change it. Otherwise, adjust batches accordingly if it's a float (including 1.0).
--> 501 if isinstance(self._trainer.limit_train_batches, float):
502 self._trainer.limit_train_batches = int(
503 self._trainer.limit_train_batches
AttributeError: 'NoneType' object has no attribute 'limit_train_batches'
Thanks!
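The traceback points at the actual problem: setup_training_data() reads self._trainer.limit_train_batches, and a model restored with from_pretrained() has no Trainer attached yet, so self._trainer is None. Attaching a PyTorch Lightning Trainer before setting up the data resolves it. A minimal sketch, assuming NeMo's set_trainer() API and PyTorch Lightning 1.x style Trainer arguments (the settings are illustrative):
import pytorch_lightning as pl

trainer = pl.Trainer(gpus=1, max_epochs=100)   # illustrative; tune devices and epochs
asr_model.set_trainer(trainer)                 # attach the trainer before data setup
asr_model.setup_training_data(modelConfig['train_ds'])
asr_model.setup_validation_data(modelConfig['validation_ds'])
trainer.fit(asr_model)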
I did that test on a small dataset of 3 hours. However, my complete dataset is around 250 hours with different English accents, so I will try to re-train on the full dataset and see how it goes.
Otherwise it seems fine, though you need to train for a lot longer than 2 epochs.
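Along those lines, when fine-tuning rather than training from scratch, it is common to also lower the learning rate before calling setup_optimization(). A minimal sketch, assuming the pretrained config keeps its optimizer settings under an optim section (the value below is illustrative, not a recommendation):
from omegaconf import open_dict

with open_dict(modelConfig):
    modelConfig.optim.lr = 1e-4   # illustrative: lower than the from-scratch learning rate

asr_model.setup_optimization(modelConfig.optim)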