
[Bug] Glow-TTS: multi-GPU training fails with KeyError: 'avg_loss'


Hi! When I follow the recipe to train a Glow-TTS model, I get this error:

 ! Run is kept in /workspace/tts/glow_tts/glow_tts_chinese-September-20-2021_02+44PM-0000000
Traceback (most recent call last):
  File "/workspace/TTS/TTS/trainer.py", line 919, in fit
    self._fit()
  File "/workspace/TTS/TTS/trainer.py", line 904, in _fit
    self.train_epoch()
  File "/workspace/TTS/TTS/trainer.py", line 738, in train_epoch
    _, _ = self.train_step(batch, batch_num_steps, cur_step, loader_start_time)
  File "/workspace/TTS/TTS/trainer.py", line 685, in train_step
    target_avg_loss = self._pick_target_avg_loss(self.keep_avg_train)
  File "/workspace/TTS/TTS/trainer.py", line 957, in _pick_target_avg_loss
    target_avg_loss = keep_avg_target["avg_loss"]
  File "/workspace/TTS/TTS/utils/generic_utils.py", line 155, in __getitem__
    return self.avg_values[key]
KeyError: 'avg_loss'

Also, the logged current_lr is always 0.00000.
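
To make the failure mode in the traceback easier to see, here is a minimal sketch of how a running-average container can raise this KeyError. It is not the actual TTS code: `KeepAverageSketch`, `update_value`, and `pick_target_avg_loss` are hypothetical stand-ins, and the idea that no value was ever stored under "avg_loss" before the lookup is an assumption, not something confirmed in this issue.

```python
class KeepAverageSketch:
    """Hypothetical stand-in for the averaging container seen in the traceback."""

    def __init__(self):
        self.avg_values = {}  # name -> running mean
        self.iters = {}       # name -> number of samples seen

    def __getitem__(self, key):
        # Raises KeyError when nothing was ever recorded under `key`,
        # which matches the last frame of the traceback.
        return self.avg_values[key]

    def update_value(self, name, value):
        # Record a new sample and update the running mean for `name`.
        if name not in self.avg_values:
            self.avg_values[name] = value
            self.iters[name] = 1
        else:
            n = self.iters[name]
            self.avg_values[name] = (self.avg_values[name] * n + value) / (n + 1)
            self.iters[name] = n + 1


def pick_target_avg_loss(keep_avg, key="avg_loss"):
    """Return the tracked average, or None if it has not been recorded yet."""
    try:
        return keep_avg[key]
    except KeyError:
        return None


keep_avg = KeepAverageSketch()
print(pick_target_avg_loss(keep_avg))  # nothing recorded yet, so no KeyError is raised here

keep_avg.update_value("avg_loss", 2.0)
keep_avg.update_value("avg_loss", 4.0)
print(pick_target_avg_loss(keep_avg))
```

A guarded lookup like this only hides the symptom. If the assumption holds, the more useful question is why the training step never records a loss under that key in the multi-GPU run.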