Incorrect use of torchaudio's rnnt_loss


It is surprising that no one has noticed that the current code feeds the output of log_softmax to torchaudio’s rnnt_loss, which is not correct.

torchaudio’s rnnt_loss accepts only logits as input.
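
To make the distinction between logits and log-probabilities concrete, here is a minimal pure-Python sketch. It is not torchaudio or speechbrain code: the `log_softmax` helper and the sample values are made up for illustration, and it only looks at forward values for a single position, not at gradients or at how torchaudio computes the loss internally.

```python
import math


def log_softmax(scores):
    """Turn unnormalized scores (logits) into log-probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


# Hypothetical joint-network outputs for one (time, label) position.
logits = [2.0, -1.0, 0.5]
log_probs = log_softmax(logits)

# Logits are unnormalized: exponentiating them does not give a distribution.
logit_mass = sum(math.exp(s) for s in logits)

# Log-probabilities are normalized: exponentiating them sums to 1.
prob_mass = sum(math.exp(lp) for lp in log_probs)

# Normalizing values that are already log-probabilities a second time.
renormalized = log_softmax(log_probs)

print("sum(exp(logits))    =", logit_mass)
print("sum(exp(log_probs)) =", prob_mass)
print("max change after second pass =",
      max(abs(a - b) for a, b in zip(log_probs, renormalized)))
```

The point of the sketch is only that the two kinds of tensor are different objects: a loss documented as taking logits expects the first kind. In this toy case the second normalization pass leaves the values numerically unchanged, but that says nothing about gradients or about the behavior of a fused loss kernel.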

I am wondering whether anyone has succeeded in training a transducer model with the current usage of torchaudio’s rnnt_loss using speechbrain.


The relevant code is listed below:

https://github.com/speechbrain/speechbrain/blob/24720c1532355dfeeda78304939dcc80283244bf/recipes/CommonVoice/ASR/transducer/train.py#L75

https://github.com/speechbrain/speechbrain/blob/24720c1532355dfeeda78304939dcc80283244bf/recipes/CommonVoice/ASR/transducer/train.py#L136-L138

https://github.com/speechbrain/speechbrain/blob/24720c1532355dfeeda78304939dcc80283244bf/recipes/CommonVoice/ASR/transducer/hparams/train_fr.yaml#L183-L185

https://github.com/speechbrain/speechbrain/blob/24720c1532355dfeeda78304939dcc80283244bf/speechbrain/nnet/losses.py#L59-L77

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 1
  • Comments: 6

Top GitHub Comments

1 reaction
vsokolov00 commented, Mar 29, 2022

It’s strange, but I didn’t notice any improvement during training after fixing this bug; the validation results are even worse. I’ll post an update once I have decoded the test set.

0 reactions
Adel-Moumen commented, Oct 28, 2022

Hello,

This issue has been solved with #1368, so I’m closing it. Thanks again for reporting this problem! 😃

