Neural vocoder support

Let’s discuss neural vocoder support.

  1. Types
  • Parallel WaveGAN (PWG), MelGAN: I can easily implement these two since I am familiar with them. kan-bayashi has packaged his repo, so we can simply run pip install -U parallel_wavegan.
  • WaveNet vocoder: I would rather not use this one because of its slow inference speed.
  • WaveGlow: I would rather not use this one because of its slow training speed.
  2. Usage & Structure: We can add a synthesis stage to the recipe. We can provide pretrained models for users to download, and use an argument like voc_expdir to load the pretrained model. In addition, kan-bayashi has also included the PWG training code in the package, so we can provide recipes for users who want to train their own vocoders. One example design could be egs/pwg/vcc2018.
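
The voc_expdir idea above could be sketched roughly as follows. This is only an illustration, not code from this repo: the helper names `build_parser` and `resolve_vocoder_files` are hypothetical, and the file layout (a `config.yml` next to `checkpoint-<N>steps.pkl` files) is an assumption about how a downloaded vocoder directory might look.

```python
import argparse
import re
from pathlib import Path

# Assumed checkpoint naming scheme; adjust if the pretrained models differ.
CHECKPOINT_PATTERN = re.compile(r"checkpoint-(\d+)steps\.pkl$")


def build_parser():
    """Hypothetical CLI exposing the voc_expdir argument proposed in the issue."""
    parser = argparse.ArgumentParser(description="Synthesis stage sketch")
    parser.add_argument(
        "--voc_expdir",
        required=True,
        help="directory holding the pretrained vocoder checkpoint and config",
    )
    return parser


def resolve_vocoder_files(voc_expdir):
    """Return (checkpoint, config) paths found inside voc_expdir.

    Picks the checkpoint with the largest step count and raises
    FileNotFoundError when either file is missing.
    """
    expdir = Path(voc_expdir)
    config = expdir / "config.yml"
    if not config.is_file():
        raise FileNotFoundError(f"no config.yml in {expdir}")

    candidates = []
    for path in expdir.glob("checkpoint-*steps.pkl"):
        match = CHECKPOINT_PATTERN.match(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        raise FileNotFoundError(f"no checkpoint-*steps.pkl in {expdir}")

    # Compare on the step count only, so the latest checkpoint wins.
    _, checkpoint = max(candidates, key=lambda item: item[0])
    return checkpoint, config
```

The returned paths would then be handed to whatever model-loading routine the chosen vocoder package provides; that step is left out here on purpose.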

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
unilight commented, Jun 3, 2020

I see. I will train vocoders in kan-bayashi/ParallelWaveGAN and just provide pretrained model links in this repo. I will work on it next.

1 reaction
k2kobayashi commented, Jun 2, 2020

It would be nice to integrate PWG/MelGAN. Actually, I have already discussed it with @kan-bayashi. He said he can create a recipe and a pre-trained model after the vcc2020 dataset is released.

For the structure, I think the following would work well:

  • add stage 6 in egs/vaevc/template/run.sh
  • implement egs/vaevc/<recipe>/local/download_pretrained_neuralvocoder.sh to download pre-trained models.
  • implement crank/bin/generate_wav_{pwg,melgan}.py to generate wav files with the pre-trained model from the h5 files generated in stage 5.
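
As a rough sketch of what the generate_wav step might do with the stage 5 output: the code below only plans which wav file each h5 file would map to and then calls a user-supplied `synthesize` function. The function names, the directory layout, and the `synthesize` callback are all hypothetical placeholders, not crank's or parallel_wavegan's actual API; the real vocoder inference would go inside that callback.

```python
from pathlib import Path

# The two vocoders named in the proposed generate_wav_{pwg,melgan}.py scripts.
SUPPORTED_VOCODERS = ("pwg", "melgan")


def plan_wav_outputs(h5_dir, out_dir, vocoder):
    """Pair every h5 file under h5_dir with a wav path under out_dir/vocoder.

    The relative directory structure of the h5 files is preserved so that
    per-speaker or per-pair subfolders carry over to the wav output.
    """
    if vocoder not in SUPPORTED_VOCODERS:
        raise ValueError(f"unsupported vocoder: {vocoder}")

    h5_dir, out_dir = Path(h5_dir), Path(out_dir)
    plan = []
    for h5_path in sorted(h5_dir.rglob("*.h5")):
        relative = h5_path.relative_to(h5_dir).with_suffix(".wav")
        plan.append((h5_path, out_dir / vocoder / relative))
    return plan


def generate_all(plan, synthesize):
    """Run synthesize(h5_path, wav_path) for each planned pair.

    synthesize is a placeholder for the real inference routine; this driver
    only creates the output folders and counts the processed files.
    """
    for h5_path, wav_path in plan:
        wav_path.parent.mkdir(parents=True, exist_ok=True)
        synthesize(h5_path, wav_path)
    return len(plan)
```

Keeping the file planning separate from the inference call means the same driver could serve both PWG and MelGAN, with only the `synthesize` callback differing between them.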
