RIR and noise augmentation for ASR QuartzNet
Hi all, I'm still having trouble implementing RIR augmentation in my ASR model, even after solving this problem. Assume I have already created manifest files for both the noise and RIR augmentation, and that I want to define the augmentation implicitly, as suggested in tutorial no. 5. With this in mind, I created the nested dictionary:
audio_augmentations = dict(
    rir_noise_aug = dict(
        prob=0.5,
        rir_manifest_path = 'rir_manifest.json',
        rir_prob = 0.5,
        noise_manifest_paths = 'noise_manifest.json',
        min_snr_db = [0,0],
        max_snr_db = [30,30],
        bg_noise_manifest_paths = 'noise_manifest.json',
        bg_min_snr_db = [10,10],
        bg_max_snr_db = [40,40]
    )
)
I then supplied it as the 'augmentor' in the model.train_ds config:
config.model.train_ds.augmentor = audio_augmentations
But when I call:
asr_model.setup_training_data(train_data_config=config.model.train_ds)
it raises an error:
TypeError Traceback (most recent call last)
<ipython-input-29-e364e6754a27> in <module>
----> 1 asr_model.setup_training_data(train_data_config=config.model.train_ds)
2 asr_model.setup_validation_data(val_data_config=config.model.validation_ds)
3 asr_model.set_trainer(trainer)
4 asr_model.setup_optimization(optim_config=config.model.optim)
/opt/conda/lib/python3.6/site-packages/nemo/collections/asr/models/ctc_models.py in setup_training_data(self, train_data_config)
326 self._update_dataset_config(dataset_name='train', config=train_data_config)
327
--> 328 self._train_dl = self._setup_dataloader_from_config(config=train_data_config)
329
330 # Need to set this because if using an IterableDataset, the length of the dataloader is the total number
/opt/conda/lib/python3.6/site-packages/nemo/collections/asr/models/ctc_models.py in _setup_dataloader_from_config(self, config)
224 def _setup_dataloader_from_config(self, config: Optional[Dict]):
225 if 'augmentor' in config:
--> 226 augmentor = process_augmentations(config['augmentor'])
227 else:
228 augmentor = None
/opt/conda/lib/python3.6/site-packages/nemo/collections/asr/parts/perturb.py in process_augmentations(augmenter)
753
754 try:
--> 755 augmentation = perturbation_types[augment_name](**augment_kwargs)
756 augmentations.append([prob, augmentation])
757 except KeyError:
/opt/conda/lib/python3.6/site-packages/nemo/collections/asr/parts/perturb.py in __init__(self, rir_manifest_path, rir_prob, noise_manifest_paths, min_snr_db, max_snr_db, rir_tar_filepaths, rir_shuffle_n, noise_tar_filepaths, apply_noise_rir, orig_sample_rate, max_additions, max_duration, bg_noise_manifest_paths, bg_min_snr_db, bg_max_snr_db, bg_noise_tar_filepaths, bg_orig_sample_rate)
508 min_snr_db=min_snr_db[i],
509 max_snr_db=max_snr_db[i],
--> 510 audio_tar_filepaths=noise_tar_filepaths[i],
511 orig_sr=orig_sr,
512 )
TypeError: 'NoneType' object is not subscriptable
I can't figure out where the problem is. I searched perturb.py but could not find the cause. The problem seems to lie in noise_tar_filepaths, but my RIR and noise files are not tarred! Thanks in advance.
@lodm94 You are right, there seems to be an issue with the default value of noise_tar_filepaths: it defaults to None, but the constructor indexes it once per noise manifest. Until we push a fix, you can work around it by changing your dict to the following:
audio_augmentations = dict(
    rir_noise_aug = dict(
        prob=0.5,
        rir_manifest_path = 'rir_manifest.json',
        rir_prob = 0.5,
        noise_manifest_paths = ['noise_manifest.json'],
        min_snr_db = [0],
        max_snr_db = [30],
        bg_noise_manifest_paths = ['noise_manifest.json'],
        bg_min_snr_db = [10],
        bg_max_snr_db = [40],
        noise_tar_filepaths=[None],
        bg_noise_tar_filepaths=[None]
    )
)
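If you have several configs to patch, the same workaround can be applied programmatically. The snippet below is only a sketch: normalize_rir_noise_kwargs and as_list are hypothetical helpers, not part of NeMo, and they assume (based on the traceback above) that the constructor indexes the manifest, SNR and tar lists once per noise manifest. The snippet wraps a bare manifest string in a list and fills the missing tar entries with None.

```python
from typing import Any, Dict, List


def as_list(value: Any) -> List[Any]:
    """Wrap a bare value (e.g. a single path string) in a one-element list."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def normalize_rir_noise_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper (not a NeMo API): return a copy of the
    rir_noise_aug kwargs with list-valued manifests and explicit
    [None, ...] tar entries for non-tarred audio."""
    out = dict(kwargs)
    # "" handles the foreground noise keys, "bg_" the background ones.
    for prefix in ("", "bg_"):
        manifest_key = prefix + "noise_manifest_paths"
        if out.get(manifest_key) is None:
            continue
        manifests = as_list(out[manifest_key])
        out[manifest_key] = manifests
        n = len(manifests)

        # One tar entry per manifest; None means the audio is not tarred.
        tar_key = prefix + "noise_tar_filepaths"
        if out.get(tar_key) is None:
            out[tar_key] = [None] * n

        # Each manifest needs its own SNR bound, since they are indexed together.
        for snr_key in (prefix + "min_snr_db", prefix + "max_snr_db"):
            if snr_key in out:
                values = as_list(out[snr_key])
                if len(values) < n:
                    raise ValueError(
                        f"{snr_key} has {len(values)} value(s) for {n} manifest(s)"
                    )
                out[snr_key] = values
    return out


# Usage with the values from this thread.
audio_augmentations = dict(
    rir_noise_aug=normalize_rir_noise_kwargs(
        dict(
            prob=0.5,
            rir_manifest_path="rir_manifest.json",
            rir_prob=0.5,
            noise_manifest_paths="noise_manifest.json",
            min_snr_db=[0],
            max_snr_db=[30],
            bg_noise_manifest_paths="noise_manifest.json",
            bg_min_snr_db=[10],
            bg_max_snr_db=[40],
        )
    )
)
```

The resulting dict can be assigned to config.model.train_ds.augmentor exactly as in the question. Once the default is fixed upstream, the helper should no longer be needed.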
Please do not reopen discussion on a months-old thread. Create a new one instead.