Errors when trying to train SC-GlowTTS
Describe the bug
I am trying to train an SC-GlowTTS model. I downloaded the config from the latest release and tried to launch TTS/bin/train_glow_tts.py. However, I get a series of errors about values missing from the config: first stats_path, then use_noise_augment, and now AssertionError: 22050 vs 48000, even though the config comments state “wav sample-rate. If different than the original data, it is resampled”. What is the proper way to train SC-GlowTTS? 😃
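Rather than discovering missing config values one crash at a time, the config can be scanned for them up front. The following is a minimal sketch, not part of the TTS codebase: the helper names find_missing_keys and load_config are hypothetical, the key list contains only the two names reported above (the training script may require others), and it assumes the config parses as plain JSON.

```python
import json

# Keys this issue reports as missing; illustrative only, not an exhaustive
# list of what the training script expects.
REPORTED_KEYS = ("stats_path", "use_noise_augment")


def find_missing_keys(config, required=REPORTED_KEYS):
    """Return the required keys that appear nowhere in the (nested) config."""
    present = set()

    def walk(node):
        # Collect every dictionary key, descending into nested dicts and lists.
        if isinstance(node, dict):
            for key, value in node.items():
                present.add(key)
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
    return [key for key in required if key not in present]


def load_config(path):
    # Assumption: the file is plain JSON. A config containing // comments
    # would need them stripped before json.load can parse it.
    with open(path, encoding="utf-8") as config_file:
        return json.load(config_file)
```

Calling find_missing_keys(load_config("/path/to/config.json")) returns the reported keys that are absent, so they can all be added before launching training.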
To Reproduce
Steps to reproduce the behavior:
- Download and unzip SC-GlowTTS config from v0.0.13 release (https://github.com/coqui-ai/TTS/releases/download/v0.0.12/tts_models--en--vctk--sc-glowtts-transformer.zip)
- Download and unzip the VCTK dataset, e.g. from here (link from SC-GlowTTS repo)
- Substitute dataset path in config for yours
- Download and install glow TTS:
git clone https://github.com/coqui-ai/TTS && cd TTS && pip install -e .
- Execute with your config path from TTS directory:
python TTS/bin/train_glow_tts.py --config_path /path/to/config/
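The AssertionError: 22050 vs 48000 presumably means the audio files on disk do not match the sample rate in the config. A quick way to confirm this before training is to read the header of each file. This is a minimal sketch using only the Python standard library; the function name find_sample_rate_mismatches is hypothetical, and it assumes the dataset is stored as uncompressed PCM .wav files (the wave module cannot read FLAC or other formats).

```python
import wave
from pathlib import Path


def find_sample_rate_mismatches(dataset_dir, expected_rate):
    """Return (path, rate) pairs for .wav files whose rate differs from expected_rate."""
    mismatches = []
    # Walk the dataset directory recursively; sorting keeps the report stable.
    for wav_path in sorted(Path(dataset_dir).rglob("*.wav")):
        # Only the header is needed, so no audio data is decoded here.
        with wave.open(str(wav_path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected_rate:
            mismatches.append((str(wav_path), rate))
    return mismatches
```

If find_sample_rate_mismatches("/path/to/VCTK", 22050) returns a non-empty list, the files need to be resampled or the config's sample rate changed so that the two agree.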
Expected behavior
The model trains without errors.
Environment (please complete the following information):
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
- PyTorch or TensorFlow version (use command below): PyTorch 1.8.1
- Python version: Python 3.7.10
- CUDA/cuDNN version: CUDA 10.2 cuDNN 7.6.5
Issue Analytics
- State:
- Created 2 years ago
- Comments:15 (6 by maintainers)
Top GitHub Comments
@loganhart420 I recommend that you use the same speaker encoder that was used in the paper and is available here (trained for 330k steps).
In SC-GlowTTS the quality of the speaker encoder is fundamental, because the model doesn’t receive any extra information about the speaker.
As your batch size is smaller, you should train for more steps. In addition, in the article we trained the model for 150k steps on VCTK, which is much smaller and has only 108 speakers. Since you are training on a larger dataset, you need to train for more steps.
Maybe @Edresson can help as the one who trained the models.
My take is that LibriTTS is a harder dataset, so it is more difficult to reach the same quality on it.