🐸 TTS roadmap
These are the main dev plans for 🐸 TTS.
If you want to contribute to 🐸 TTS and don't know where to start, you can pick one here and start with our Contribution Guideline. We're also always here to help.
Feel free to pick one or suggest a new one.
Contributions are always welcome 💪.
v0.1.0 Milestones
- Better model config handling #21
- TTS recipes for public datasets.
- TTS trainer API to unify all the model training scripts.
- TTS, Vocoder and SpeakerEncoder model abstractions and APIs.
- Documentation for
  - Implementing a new model using 🐸 TTS.
  - Training a model on a new dataset from gecko.
  - Using `Synthesizer` interface on `CLI` or `Server`.
  - Extracting Spectrograms for Vocoder training.
  - Contributing a new pre-trained 🐸 TTS model.
  - Explanation for Model config parameters.
v0.2.0 Milestones
- Grapheme 2 Phoneme in-house conversion. (Thx to gruut)
- Implement VITS model.
v0.3.0 Milestones
- Implement generic ForwardTTS API.
- Implement Fast Speech model.
- Implement Fast Pitch model.
v0.4.0 Milestones
- Trainer API v2 - join the discussion
- Multi-speaker VCTK recipes for all the `TTS.tts` models.
v0.5.0 Milestones
- Support for multi-lingual models
- YourTTS release
v0.6.0 Milestones
- Add ESpeak support
- New Tokenizer and Phonemizer APIs #937
- New Model API #1078
- Splitting the trainer as a separate repo: 👟Trainer
- Update VITS model API
- Gradient accumulation. #560 (in 👟Trainer)
v0.7.0 Milestones
- Implement Capacitron (@a-froghyar, @WeberJulian)
- Release pretrained Capacitron
v0.8.0 Milestones
- Separate numpy transforms
- Better data sampling for VITS
- New Thorsten DE models (@thorstenMueller)
🏃‍♀️ Milestones along the way
- Implement End-to-end training API for ForwardTTS models with a vocoder. #1510
- Implement a Python voice synthesis API.
- Inject phonemes to the input text at inference. #1452
- AdaSpeech1/2 https://arxiv.org/pdf/2104.09715 and https://arxiv.org/abs/2103.00993
- Let the user pass a custom text cleaner function.
- Refactor the text cleaners for a more flexible and transparent API.
- Implement HifiGAN2 (not the vocoder)
- Implement emotion and style adaptation.
- Implement FastSpeech2 (https://arxiv.org/abs/2006.04558).
- AutoTTS (@loganhart420)
- Watermarking TTS outputs to sign against DeepFakes.
- Implement SSML v0.0.1
- ONNX and TorchScript model exports.
- TensorFlow run-time for training models.
New TTS models
- AlignTTS (@erogol)
- HiFiGAN (#16, @rishikksh20 and @erogol)
- UnivNet Vocoder (@rishikksh20)
- VITS paper
- FastPitch source
- Alignment Network paper
- End2End TTS combining aligner + tts + vocoder.
- Multi-Lingual TTS (#11, @WeberJulian)
- ParallelTacotron paper (open for contribution)
- Efficient TTS paper (open for contribution)
- Gaussian length regulator from https://arxiv.org/pdf/2010.04301.pdf (open for contribution)
- LightSpeech from https://arxiv.org/pdf/2102.04040.pdf (open for contribution)
- AdaSpeech1/2 https://arxiv.org/pdf/2104.09715 and https://arxiv.org/abs/2103.00993
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 52
- Comments: 42 (17 by maintainers)
Top GitHub Comments
I'm learning the code/API and performing experiments. I hope to contribute soon.
I'm also wondering if I can donate (money) to Coqui?
Hello, thanks for great works! I'm a fan of `Coqui TTS`. I'm porting some of the stuffs in the project to the `Rust` for the following reasons.
The `VC` in the `YourTTS` has been successfully implemented. And for this purpose, an example of saving/loading a pretrained `Vits` model has been added in the repo. I write it on Milestones PR because I think my work can be helpful to others.