Fine-tuning procedure for mb_melgan vocoder, Voice Quality degrading with Fine-tuning.

Hello! I’m trying to fine-tune a vocoder model for Indian accents. I followed the suggestions from thread #296 and have arrived at a suitable acoustic model.

To improve the output voice quality of the current vocoder (multiband_melgan.v1), I followed the fine-tuning process described in examples/multiband_melgan, with 940000.h5 (multiband_melgan.v1-EN) as the pretrained model.

However, the output has degraded (completely muffled speech) compared to the pretrained vocoder.

I used the same dataset as the one used for FastSpeech model training, with this command:

python ./examples/multiband_melgan/train_multiband_melgan.py \
--train-dir ./dump/train/ \
--dev-dir ./dump/valid/ \
--outdir ./examples/multiband_melgan/exp/train.multiband_melgan.v1/ \
--config ./examples/multiband_melgan/conf/multiband_melgan.v1.yaml \
--use-norm 1 \
--pretrained mb_melgan_generator.h5 

These are the loss plots that I obtained: [plot: eval] [plot: train]

Please help me debug this problem. Thank you.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
WadoodAbdul commented, Jul 7, 2021

Thanks for the clarification @dathudeptrai 😃

1 reaction
dathudeptrai commented, Jul 7, 2021

Thanks for the help, @dathudeptrai. I had not continued after 200k steps.

Just for clarification, I have to tune the generator and discriminator together for the rest of the steps (1M), not just the generator for 1M steps. Is that correct?

yes 😄
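
To make the confirmed schedule concrete, here is a minimal Python sketch of which networks get updated at a given training step. It is an illustration only and not code from the repository: the helper names `networks_to_update` and `summarize_schedule` are hypothetical, the 200k switch point is assumed from the step count mentioned in the thread, and the 1M total is one reading of "the rest of the steps (1M)". Check the values in your own config file before relying on them.

```python
# Hypothetical helpers for illustration; not part of any library.
# Assumption: the generator trains alone until `discriminator_start_step`,
# after which the generator and discriminator are trained together.

def networks_to_update(step, discriminator_start_step=200_000):
    """Return the names of the networks updated at a given global step."""
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < discriminator_start_step:
        return ("generator",)
    return ("generator", "discriminator")


def summarize_schedule(total_steps=1_000_000, discriminator_start_step=200_000):
    """Split a training run into generator-only and adversarial step counts."""
    generator_only = min(total_steps, discriminator_start_step)
    adversarial = max(0, total_steps - discriminator_start_step)
    return {
        "generator_only_steps": generator_only,
        "adversarial_steps": adversarial,
    }


if __name__ == "__main__":
    # Stopping at the switch point leaves no adversarial training at all,
    # which matches the situation described in the thread.
    print(summarize_schedule(total_steps=200_000))
    print(summarize_schedule(total_steps=1_000_000))
```

Under these assumed numbers, a run that stops at 200k steps never reaches the adversarial phase, which is consistent with the answer that fine-tuning should continue with both networks.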
