Very high WER on test-other for CRDNN librispeech model
Hi, I was experimenting with the pretrained CRDNN model on the English LibriSpeech data. While the performance is quite good on the test-clean set, the performance on the noisy test-other set is very bad.
Looking at WER stats from the pretrained model on gdrive:
test-clean
%WER 3.09 [ 1622 / 52576, 167 ins, 171 del, 1284 sub ]
%SER 33.66 [ 882 / 2620 ]
Scored 2620 sentences, 0 not present in hyp.
================================================================================
ALIGNMENTS
Format:
<utterance-id>, WER DETAILS
<eps> ; reference ; on ; the ; first ; line
I ; S ; = ; = ; S ; D
and ; hypothesis ; on ; the ; third ; <eps>
================================================================================
672-122797-0033, %WER 0.00 [ 0 / 2, 0 ins, 0 del, 0 sub ]
A ; STORY
= ; =
A ; STORY
================================================================================
2094-142345-0041, %WER 0.00 [ 0 / 1, 0 ins, 0 del, 0 sub ]
DIRECTION
=
DIRECTION
================================================================================
2830-3980-0026, %WER 50.00 [ 1 / 2, 0 ins, 0 del, 1 sub ]
VERSE ; TWO
S ; =
FIRST ; TWO
================================================================================
237-134500-0025, %WER 50.00 [ 1 / 2, 0 ins, 0 del, 1 sub ]
OH ; EMIL
= ; S
OH ; AMIEL
(cut for brevity)
test-other
%WER 219.23 [ 114 / 52, 64 ins, 0 del, 50 sub ]
%SER 100.00 [ 16 / 16 ]
Scored 16 sentences, 0 not present in hyp.
2414-128291-0020, %WER 700.00 [ 7 / 1, 6 ins, 0 del, 1 sub ]
WELL ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps>
S ; I ; I ; I ; I ; I ; I
I ; DON'T ; KNOW ; WHAT ; YOU'RE ; TALKING ; ABOUT
================================================================================
7902-96592-0020, %WER 700.00 [ 7 / 1, 6 ins, 0 del, 1 sub ]
NONSENSE ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps>
S ; I ; I ; I ; I ; I ; I
I ; DON'T ; KNOW ; WHAT ; YOU'RE ; TALKING ; ABOUT
================================================================================
8188-269290-0057, %WER 125.00 [ 5 / 4, 2 ins, 0 del, 3 sub ]
<eps> ; <eps> ; I ; WILL ; TELL ; HER
I ; I ; = ; S ; S ; S
I'M ; AFRAID ; I ; CAN'T ; AFFORD ; IT
================================================================================
3538-142836-0023, %WER 600.00 [ 6 / 1, 5 ins, 0 del, 1 sub ]
ICES ; <eps> ; <eps> ; <eps> ; <eps> ; <eps>
S ; I ; I ; I ; I ; I
I'M ; AFRAID ; I ; CAN'T ; AFFORD ; IT
(cut for brevity)
Why is the performance so bad on the test-other set? I also noticed that performance is really bad on any audio outside of test-clean. For example, I tried it on SLR70 and the average WER there is also well above 100%.
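For reference, here is a minimal sketch of how the numbers in these reports fit together, assuming the usual definition WER = (insertions + deletions + substitutions) / reference words. Because insertions count against the reference length, a long hypothesis for a one-word reference pushes WER far above 100%. The function names below are illustrative only and are not SpeechBrain's API.

```python
def wer_percent(ins, dele, sub, ref_words):
    """WER in percent from error counts and the number of reference words."""
    return 100.0 * (ins + dele + sub) / ref_words


def edit_counts(ref, hyp):
    """Return (ins, dele, sub) for a minimum-edit-distance word alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # table[i][j] = (total, ins, dele, sub) for ref[:i] versus hyp[:j]
    table = [[None] * cols for _ in range(rows)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, cols):
        table[0][j] = (j, j, 0, 0)  # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            t, a, d, s = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (t, a, d, s)  # match, no cost
            else:
                best = (t + 1, a, d, s + 1)  # substitution
            t, a, d, s = table[i - 1][j]
            if t + 1 < best[0]:
                best = (t + 1, a, d + 1, s)  # deletion
            t, a, d, s = table[i][j - 1]
            if t + 1 < best[0]:
                best = (t + 1, a + 1, d, s)  # insertion
            table[i][j] = best
    return table[-1][-1][1:]


# Utterance 2414-128291-0020 from the test-other report above
ref = ["WELL"]
hyp = "I DON'T KNOW WHAT YOU'RE TALKING ABOUT".split()
ins, dele, sub = edit_counts(ref, hyp)
print(ins, dele, sub, wer_percent(ins, dele, sub, len(ref)))

# Corpus-level totals copied from the two report headers
print(round(wer_percent(167, 171, 1284, 52576), 2))  # test-clean
print(round(wer_percent(64, 0, 50, 52), 2))          # test-other
```

This reproduces the reported figures from their own counts, so the scoring arithmetic is consistent and the problem lies in the hypotheses themselves.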
Issue Analytics
- State:
- Created a year ago
- Comments:10
Top GitHub Comments
Ok, I can’t today because a problem occurred on our lab cluster ahah
Thanks for adding the new model! I just saw that it has been updated (https://drive.google.com/drive/folders/19mAyMR1ITSb83Anhds4n694PLwKD47yf?usp=sharing). It looks good now. I will test it, and if there are any problems I will reopen the issue.