tts_models/en/ljspeech/tacotron2-DDC model does not stop decoding if there is no punctuation at the end of the sentence.
Describe the bug In some cases the spoken text is messed up.
To Reproduce Steps to reproduce the behavior:
- Run the following command
tts-server --model_name tts_models/en/ljspeech/tacotron2-DDC --vocoder_name vocoder_models/en/ljspeech/hifigan_v2
- open the web interface at http://127.0.0.1:5002/ and generate audio
- text and audio examples:
banana
https://soundcloud.com/davidak-de/ai-with-tourette-syndrome-struggles-to-say-banana
popular tourette youtuber tells the banana tic story. it’s pretty close 😄 https://youtu.be/Q5MrVcpq-a8?t=74
dada
https://soundcloud.com/davidak-de/dada-ai-singing
what have you done??? i’m literally screaming. ha
https://soundcloud.com/davidak-de/disturbed-ai-is-literally-screaming
shit, fuck! ha
https://soundcloud.com/davidak-de/ai-with-tourette-syndrome-swearing
what’s going on
https://soundcloud.com/davidak-de/ai-total-breakdown
cyber space mastodon coqui tts cocktail noise linus G H A
well, every single letter produces 12 seconds of noise
the problem seems to be that it does not know where the end is when there is no ".", ",", "!" or "?" at the end. it just continues speaking what comes to its mind…
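A possible client-side workaround, sketched under the assumption that the missing end punctuation really is the trigger: append a period before sending the text whenever it does not already end in one of the four characters named above. The helper name `ensure_terminal_punctuation` is hypothetical and not part of TTS, and it only looks at the end of the whole input, not at each sentence the server splits out.

```python
# Characters named in this report as letting the decoder stop cleanly.
TERMINAL_PUNCTUATION = (".", ",", "!", "?")


def ensure_terminal_punctuation(text: str, fallback: str = ".") -> str:
    """Return text with trailing whitespace removed and, if needed, a fallback mark appended.

    Hypothetical helper for illustration only; it does not call into TTS.
    """
    stripped = text.rstrip()
    if not stripped:
        # Nothing to speak, so nothing to terminate.
        return stripped
    if stripped.endswith(TERMINAL_PUNCTUATION):
        return stripped
    return stripped + fallback


for sample in ["banana", "what's going on", "what have you done???"]:
    print(repr(ensure_terminal_punctuation(sample)))
```

This does not fix the stop behavior in the model itself. It only avoids the inputs that trigger it in the examples above.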
you can try to generate the audio multiple times to get different results. sometimes it’s completely confused, but most of the time only the last word loops
when this issue occurs, you see this in the terminal:
> Model input: what's going on
> Text splitted to sentences.
["what's going on"]
| > Decoder stopped with 'max_decoder_steps
> Processing time: 7.120903015136719
> Real-time factor: 0.584658592060488
[INFO] ::ffff:127.0.0.1 - - [13/May/2021 20:45:27] "GET /api/tts?text=what%27s%20going%20on HTTP/1.1" 200 -
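The request in the last log line can also be built from a script instead of the web interface, which makes it easier to retry the same text several times. This is a minimal sketch: the host, port and `/api/tts?text=` path are taken from the reproduction steps and the log above, while the function name `build_tts_url` is made up for illustration.

```python
from urllib.parse import quote


def build_tts_url(text: str, base_url: str = "http://127.0.0.1:5002") -> str:
    """Build the GET URL that the log above shows for a given input text.

    Hypothetical helper; only the path and query parameter come from the log.
    """
    # Percent-encode everything outside the unreserved set, so the
    # apostrophe becomes %27 and spaces become %20, as in the logged request.
    return f"{base_url.rstrip('/')}/api/tts?text={quote(text, safe='')}"


print(build_tts_url("what's going on"))
```

The resulting URL can then be fetched with any HTTP client while `tts-server` is running, for example `urllib.request.urlopen`.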
Related to:
- https://github.com/coqui-ai/TTS/discussions/278
- https://github.com/coqui-ai/TTS/discussions/325
- https://github.com/coqui-ai/TTS/wiki/FAQ#my-tacotron-model-does-not-stop---i-see-decoder-stopped-with-max_decoder_steps---stopnet-does-not-work
Expected behavior read the text like a human would
Environment (please complete the following information):
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): NixOS GNU/Linux 21.05pre288911.65d6153aec8 (Okapi)
- PyTorch or TensorFlow version (use command below): which command???
- Python version: 3.8.9
- CUDA/cuDNN version:
- GPU model and memory:
- Exact command to reproduce:
- TTS version: 0.0.12
Additional context I hope it’s OK to make this joke that the AI has a mental disorder and it’s not too offensive.
Issue Analytics
- Created 2 years ago
- Comments:7 (4 by maintainers)
Top GitHub Comments
just updated the title to be more informative. Hope it is fine with you.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look our discussion channels.