Generated voice is unclear and noisy
I’ve trained the model for Cantonese, using the MTTS frontend (https://github.com/Jackiexiao/MTTS) with modifications for Cantonese (https://github.com/mirfan899/MTTS). The model trains and wav files are generated, but the audio is noisy and unclear. I’ve attached the log for reference (output.log) and a generated audio sample (ASR1.wav.zip).
Issue Analytics
- State:
- Created 4 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
That’s probably the issue then. I’d recommend at least 1000 sentences. Preferably a lot more for high quality synthesis.
Currently 220 audio clips.
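To check a corpus against the 1000-sentence recommendation before training, a minimal sketch like the one below can count the recordings and add up their duration. It assumes one utterance per PCM `.wav` file in a single directory; the directory name `wav/` and the helper names are hypothetical and not part of MTTS or the training toolkit.

```python
import wave
from pathlib import Path

# Minimum utterance count suggested in the comment above.
MIN_SENTENCES = 1000


def summarize_corpus(wav_dir):
    """Return (number of readable .wav files, total duration in seconds)."""
    count = 0
    seconds = 0.0
    for path in sorted(Path(wav_dir).glob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wf:
                seconds += wf.getnframes() / wf.getframerate()
        except (wave.Error, EOFError):
            # Skip files that are empty or not valid PCM WAV.
            continue
        count += 1
    return count, seconds


def is_large_enough(count, minimum=MIN_SENTENCES):
    """True if the corpus meets the suggested minimum utterance count."""
    return count >= minimum


if __name__ == "__main__":
    # "wav" is a hypothetical path; point it at your own recordings.
    n, total = summarize_corpus("wav")
    print(f"{n} utterances, {total / 60:.1f} minutes of audio")
    if not is_large_enough(n):
        print(f"Fewer than {MIN_SENTENCES} utterances; expect noisy output.")
```

With 220 recordings, `is_large_enough` returns False, which matches the diagnosis in this thread.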