Suggestion: add a more detailed data-loading pipeline explanation to the fine-tuning Colab tutorial
It’s been great to see more tutorials being added for learning SpeechBrain. In the older tutorial Pretrained Models and Fine-Tuning with HuggingFace, I noticed a slight inconsistency between how the data is loaded when using the pre-trained model and when fine-tuning it.
When using the pre-trained model to transcribe a file, the audio is loaded through the EncoderDecoderASR
interface, which 1) loads the audio with torchaudio and 2) normalizes it with an AudioNormalizer.
In the fine-tuning section, the audio is loaded with `dataset.add_dynamic_item(sb.dataio.dataio.read_audio, takes="file_path", provides="signal")`, and no normalizer is applied inside that function.
Since decoding relies on normalized audio, wouldn’t it be more natural to fine-tune the model on normalized audio rather than on the raw signal?
It might also be useful to add a section about how to use the fine-tuned model to transcribe a file.
Issue Analytics
- Created: 2 years ago
- Comments: 6
Top GitHub Comments
I am going to fix this today; thank you @ziz19 for pointing this out.
@popcornell Thank you so much for the explanation. GitHub username is perfectly fine! SpeechBrain is so much easier to use compared to other frameworks. Thank you all for the hard work! I’ll keep using SpeechBrain now and in the future.