RIR and noise augmentation for ASR Quartznet

Hi all, I’m still having trouble implementing RIR in my ASR model after I solved this problem. Let’s assume I have already created manifest files for both noise and RIR augmentation. Let’s also assume that I want to implement augmentation with an implicit definition, as suggested in tutorial n.5. With this in mind, I created the nested dictionary:

audio_augmentations = dict(
    rir_noise_aug = dict(
        prob = 0.5,
        rir_manifest_path = 'rir_manifest.json',
        rir_prob = 0.5,
        noise_manifest_paths = 'noise_manifest.json',
        min_snr_db = [0,0],
        max_snr_db = [30,30],
        bg_noise_manifest_paths = 'noise_manifest.json',
        bg_min_snr_db = [10,10],
        bg_max_snr_db = [40,40]
    )
)

And supplied the ‘augmentor’ to the model.train_ds config:

config.model.train_ds.augmentor = audio_augmentations

But when I call:

asr_model.setup_training_data(train_data_config=config.model.train_ds)

It raises an error:

TypeError                                 Traceback (most recent call last)
<ipython-input-29-e364e6754a27> in <module>
----> 1 asr_model.setup_training_data(train_data_config=config.model.train_ds)
      2 asr_model.setup_validation_data(val_data_config=config.model.validation_ds)
      3 asr_model.set_trainer(trainer)
      4 asr_model.setup_optimization(optim_config=config.model.optim)

/opt/conda/lib/python3.6/site-packages/nemo/collections/asr/models/ctc_models.py in setup_training_data(self, train_data_config)
    326         self._update_dataset_config(dataset_name='train', config=train_data_config)
    327 
--> 328         self._train_dl = self._setup_dataloader_from_config(config=train_data_config)
    329 
    330         # Need to set this because if using an IterableDataset, the length of the dataloader is the total number

/opt/conda/lib/python3.6/site-packages/nemo/collections/asr/models/ctc_models.py in _setup_dataloader_from_config(self, config)
    224     def _setup_dataloader_from_config(self, config: Optional[Dict]):
    225         if 'augmentor' in config:
--> 226             augmentor = process_augmentations(config['augmentor'])
    227         else:
    228             augmentor = None

/opt/conda/lib/python3.6/site-packages/nemo/collections/asr/parts/perturb.py in process_augmentations(augmenter)
    753 
    754             try:
--> 755                 augmentation = perturbation_types[augment_name](**augment_kwargs)
    756                 augmentations.append([prob, augmentation])
    757             except KeyError:

/opt/conda/lib/python3.6/site-packages/nemo/collections/asr/parts/perturb.py in __init__(self, rir_manifest_path, rir_prob, noise_manifest_paths, min_snr_db, max_snr_db, rir_tar_filepaths, rir_shuffle_n, noise_tar_filepaths, apply_noise_rir, orig_sample_rate, max_additions, max_duration, bg_noise_manifest_paths, bg_min_snr_db, bg_max_snr_db, bg_noise_tar_filepaths, bg_orig_sample_rate)
    508                     min_snr_db=min_snr_db[i],
    509                     max_snr_db=max_snr_db[i],
--> 510                     audio_tar_filepaths=noise_tar_filepaths[i],
    511                     orig_sr=orig_sr,
    512                 )

TypeError: 'NoneType' object is not subscriptable

I can’t figure out where the problem is. I looked through perturb.py, but I guess I’m too inexperienced to find the solution. The problem seems to lie in noise_tar_filepaths, but my RIR and noise files are not tarred! Thanks in advance.
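The last frame of the traceback can be reproduced without NeMo. The sketch below is a simplified stand-in, not the library's code: build_noise_sources is a hypothetical function that only mirrors the pattern visible at perturb.py line 510, where noise_tar_filepaths is indexed per source. It assumes the parameter defaults to None, which is what the error message suggests.

```python
# Hypothetical stand-in for the loop shown in the traceback; NOT NeMo's actual code.
def build_noise_sources(noise_manifest_paths, min_snr_db, max_snr_db,
                        noise_tar_filepaths=None):
    sources = []
    for i in range(len(noise_manifest_paths)):
        sources.append(dict(
            manifest_path=noise_manifest_paths[i],
            min_snr_db=min_snr_db[i],
            max_snr_db=max_snr_db[i],
            # Indexing happens even when the audio is not tarred,
            # so a None default fails on this line.
            audio_tar_filepaths=noise_tar_filepaths[i],
        ))
    return sources


try:
    build_noise_sources(['noise_manifest.json'], [0], [30])
except TypeError as err:
    error_message = str(err)

print(error_message)
```

In this sketch the failure has nothing to do with whether the files are tarred: the default is simply indexed before anyone checks that it is a list.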

Issue Analytics

State:
Created 3 years ago
Comments:9

Top GitHub Comments

jbalam-nv commented, Feb 3, 2021

@lodm94 You are right, there seems to be an issue with the default value of noise_tar_filepaths. Until we push a fix, as a workaround you could change your dict to the following:

audio_augmentations = dict(
    rir_noise_aug = dict(
        prob = 0.5,
        rir_manifest_path = 'rir_manifest.json',
        rir_prob = 0.5,
        noise_manifest_paths = ['noise_manifest.json'],
        min_snr_db = [0],
        max_snr_db = [30],
        bg_noise_manifest_paths = ['noise_manifest.json'],
        bg_min_snr_db = [10],
        bg_max_snr_db = [40],
        noise_tar_filepaths = [None],
        bg_noise_tar_filepaths = [None]
    )
)
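The workaround has two parts: every per-source argument becomes a list with one entry per manifest, and the tar file path arguments are set explicitly to a list of None of the same length. A minimal sketch of a helper that applies both rules is shown here. normalize_rir_noise_kwargs is a hypothetical name, not part of NeMo, and the helper assumes the SNR arguments are already lists.

```python
def normalize_rir_noise_kwargs(kwargs):
    """Return a copy of the rir_noise_aug kwargs with per-source lists aligned.

    Hypothetical helper, not part of NeMo.
    """
    out = dict(kwargs)
    for prefix in ("", "bg_"):
        manifests_key = prefix + "noise_manifest_paths"
        manifests = out.get(manifests_key)
        if manifests is None:
            continue
        # A single manifest given as a string becomes a one-element list.
        if isinstance(manifests, str):
            manifests = [manifests]
        out[manifests_key] = list(manifests)
        n = len(out[manifests_key])
        # Each SNR list needs exactly one entry per manifest.
        for snr_key in (prefix + "min_snr_db", prefix + "max_snr_db"):
            if snr_key in out and len(out[snr_key]) != n:
                raise ValueError(
                    "%s has %d entries but %s has %d"
                    % (snr_key, len(out[snr_key]), manifests_key, n)
                )
        # Untarred audio: one explicit None per manifest instead of a bare None.
        out.setdefault(prefix + "noise_tar_filepaths", [None] * n)
    return out


rir_noise_kwargs = normalize_rir_noise_kwargs(dict(
    prob=0.5,
    rir_manifest_path='rir_manifest.json',
    rir_prob=0.5,
    noise_manifest_paths='noise_manifest.json',
    min_snr_db=[0],
    max_snr_db=[30],
    bg_noise_manifest_paths='noise_manifest.json',
    bg_min_snr_db=[10],
    bg_max_snr_db=[40],
))
audio_augmentations = dict(rir_noise_aug=rir_noise_kwargs)
```

With the two-element SNR lists from the original question and a single manifest, this helper raises ValueError, which flags the length mismatch before the dict reaches setup_training_data.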

titu1994 commented, Nov 9, 2022

Do not reopen discussion on a months-old thread. Create a new one.
