[Bug] VITS LJSpeech recipe no improvement in 70k steps (batch size 16)


Describe the bug

I trained with the VITS LJSpeech recipe for 70k steps (batch size 16). The loss never drops, the alignment plot looks perfect from the very first checkpoints, and the generated audio sounds the same at every checkpoint.

Loss:

image

Alignment at 12k:

image

Alignment at 70k:

image

Audio at 12k:

https://user-images.githubusercontent.com/88913682/142049725-c539f76b-2e10-41ef-a6e8-c29d0798738a.mp4

Audio at 70k:

https://user-images.githubusercontent.com/88913682/142049732-801815eb-886f-4b30-bb85-810fd471c6b9.mp4

To Reproduce

Apply the fix described in Additional context below.

Copy recipes/ljspeech/vits_tts/train_vits.py to runs2/train_vits.py and make the following changes:

  • Point the LJSpeech dataset path at your local copy
  • batch_size=16
  • eval_batch_size=8

Run it with CUDA_VISIBLE_DEVICES=1 python runs2/train_vits.py.

Expected behavior

I expect that:

  • Loss drops over time.
  • Alignment starts out blurry and develops a line over time.
  • The audio changes over time.
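The first expectation can be checked numerically rather than by eye. The following is a minimal sketch, assuming the loss values have already been exported as plain `(step, loss)` pairs; the function names, the threshold, and the sample numbers are all hypothetical and are not part of the TTS recipe. It fits a least-squares line through the points and reports whether the slope is clearly negative.

```python
from statistics import mean


def loss_slope(steps, losses):
    """Return the least-squares slope of loss against training step."""
    if len(steps) != len(losses) or len(steps) < 2:
        raise ValueError("need at least two (step, loss) pairs of equal length")
    step_mean = mean(steps)
    loss_mean = mean(losses)
    covariance = sum((s - step_mean) * (l - loss_mean) for s, l in zip(steps, losses))
    variance = sum((s - step_mean) ** 2 for s in steps)
    if variance == 0:
        raise ValueError("steps must not all be identical")
    return covariance / variance


def is_improving(steps, losses, min_drop_per_1k_steps=0.01):
    """True if the fitted line falls by at least the given amount per 1000 steps.

    The default threshold is an arbitrary illustration, not a recommended value.
    """
    return loss_slope(steps, losses) * 1000 <= -min_drop_per_1k_steps


# Made-up numbers for illustration only; they are not taken from this issue.
sample_steps = [0, 10_000, 20_000, 30_000]
flat_losses = [2.5, 2.5, 2.5, 2.5]
falling_losses = [2.5, 2.0, 1.5, 1.0]

print(is_improving(sample_steps, flat_losses))
print(is_improving(sample_steps, falling_losses))
```

VITS is trained adversarially, so not every logged loss is expected to fall; which curve to feed into a check like this is a judgment call.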

Environment

  • OS platform and distribution: Ubuntu 18.04
  • PyTorch or TensorFlow version: PyTorch 1.10
  • Python version: 3.7
  • CUDA/cuDNN version: 11.3 / 8200
  • GPU model and memory: RTX 3080 10GB
  • Exact command to reproduce: CUDA_VISIBLE_DEVICES=1 python runs2/train_vits.py

Additional context

When I first ran the recipe, I hit this bug: https://github.com/NVIDIA/apex/issues/694

And I applied this fix: https://github.com/NVIDIA/apex/issues/694#issuecomment-918833904


EDIT

Be careful applying the fix I mentioned; I think it’s the reason that training was broken for me.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments:6 (1 by maintainers)

Top GitHub Comments

1 reaction
skol101 commented, Nov 23, 2021

Looks like apex is supported. Too bad the apex package from conda doesn’t run on Python >= 3.9. With mixed_precision: False, about 90% of the RTX 3090’s memory is suddenly in use at batch size 32, compared with 65-70% with mixed_precision: True.

1 reaction
khusainovaidar commented, Nov 21, 2021

I had more or less the same issue and ‘solved’ it by setting mixed_precision to False in the config.
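The comment does not say which config format was edited. As a rough sketch, assuming the training config is stored as JSON with a top-level `mixed_precision` key (an assumption; in the Python recipe the same setting would presumably be changed in the script itself), the workaround could be applied like this. The helper name and the example config are hypothetical.

```python
import json


def disable_mixed_precision(config_text: str) -> str:
    """Return the JSON config text with mixed_precision switched off.

    Assumes a top-level "mixed_precision" key; raises if it is missing so a
    typo or a different config layout is not silently ignored.
    """
    config = json.loads(config_text)
    if "mixed_precision" not in config:
        raise KeyError("config has no top-level 'mixed_precision' key")
    config["mixed_precision"] = False
    return json.dumps(config, indent=4)


# Hypothetical config fragment using the batch sizes from the report above.
example_config = '{"batch_size": 16, "eval_batch_size": 8, "mixed_precision": true}'
patched = json.loads(disable_mixed_precision(example_config))
print(patched["mixed_precision"])
```

As the other comment in this thread notes, turning mixed precision off raises GPU memory use, so the batch size may need to come down on smaller cards.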


