Losses are not correct
There is a problem with `calculate_3d_loss` in `tensorflow_tts.utils.strategy`.
The mel ground truth (`y_gt`) should not be truncated when the mel prediction (`y_pred`) is shorter; a prediction that is too short should be penalized. One way to do this is to pad the prediction to the ground-truth length.
In practice, this rarely happens, because `stop_token_loss` is set up incorrectly and usually causes the model to produce output longer than the ground truth. This is also due to `y_pred` being truncated in `calculate_2d_loss`. Consider the following, where `max_mel_length = 3`:
```
stop_token_predictions = [-20, -20, -20, -20, -20, -20, 5, 5, 5, 5]
stop_gts = [0, 0, 0]
```
Truncating `stop_token_predictions` to the ground-truth length makes the loss close to 0, even though the stop token prediction is completely wrong (the model stops after 6 mel slices instead of 3). To make it right, `stop_gts` should instead be padded with 1s to the prediction length.
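A quick numerical check of the two options, using plain binary cross-entropy on the logits above (a sketch; the actual `stop_token_loss` in the repo may be wired differently):

```python
import tensorflow as tf

# Logits: the model keeps going for 6 slices (sigmoid ~ 0), then signals stop.
stop_token_predictions = tf.constant([-20., -20., -20., -20., -20., -20., 5., 5., 5., 5.])
stop_gts = tf.constant([0., 0., 0.])  # ground truth: stop after 3 slices

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# Current behaviour: truncate predictions to len(stop_gts) -> loss ~ 0.
truncated_loss = bce(stop_gts, stop_token_predictions[:3])

# Proposed fix: pad stop_gts with 1s to the prediction length, so the
# "keep going" logits at positions 4-6 are heavily penalized.
padded_gts = tf.concat([stop_gts, tf.ones(7)], axis=0)
padded_loss = bce(padded_gts, stop_token_predictions)

print(truncated_loss.numpy())  # ~ 2e-9, i.e. no penalty at all
print(padded_loss.numpy())     # ~ 6.0, reflecting the wrong stop point
```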
The loss functions also need masking, and training can be made much faster by using `bucket_by_sequence_length` to batch the dataset, as sketched below. I’m currently implementing these.
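A sketch of both points; `masked_mel_loss` and all bucket boundaries/batch sizes here are illustrative assumptions, while `bucket_by_sequence_length` is the standard `tf.data.Dataset` method:

```python
import tensorflow as tf

def masked_mel_loss(y_gt, y_pred, mel_lengths):
    # Mask out padded frames so they contribute nothing to the loss.
    mask = tf.sequence_mask(mel_lengths, maxlen=tf.shape(y_gt)[1], dtype=tf.float32)
    per_frame = tf.reduce_mean(tf.abs(y_gt - y_pred), axis=-1)  # [batch, time]
    return tf.reduce_sum(per_frame * mask) / tf.reduce_sum(mask)

# Toy variable-length "mel" dataset: 80 mel bins, random values as placeholders.
lengths = [120, 350, 500, 90, 700]
mels = [tf.random.normal([n, 80]) for n in lengths]
dataset = tf.data.Dataset.from_generator(
    lambda: mels,
    output_signature=tf.TensorSpec(shape=[None, 80], dtype=tf.float32),
)

# Group utterances of similar length so each batch wastes little padding.
dataset = dataset.bucket_by_sequence_length(
    element_length_func=lambda mel: tf.shape(mel)[0],
    bucket_boundaries=[200, 400, 600],
    bucket_batch_sizes=[16, 8, 4, 2],  # one more entry than boundaries
)

for batch in dataset:
    print(batch.shape)  # [batch, padded_time, 80], padded per batch
```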
@iamanigeeit hi, tacotron2 uses teacher forcing, so the lengths of `y_gt` and `y_pred` are equal 😄