Bad performance on real audio

Hi @leo19941227

I was able to get the code running and did some exploration on ASR. However, the performance is very poor, and I’m trying to figure out what the problem could be. One possibility is the configuration I used. I would really appreciate it if you could have a look and share any suggestions or hints.

python run_downstream.py -m=inference  -c='./downstream/asr/config.yaml' -d=asr -t='mani.flac' -p='result/wav2vec2_hug_base_960_final' -n='result/downstream/asr_wav2vec2_hug_base_960' -i='result/downstream/wav2vec2_hug_base_960_final/dev-clean-best.ckpt' -e='result/downstream/wav2vec2_hug_base_960_final/dev-clean-best.ckpt' -u='result/downstream/wav2vec2_hug_base_960' -s=hidden_states

And this is the text:

No, the cheaper option would be great though.

No problem. How about a flight that leaves from Seattle to Paris on May 15 at 3 PM. Your next flight will be on May 19 at 10 PM to London. 

And this is the transcription from model:

NO CHIEF WOULD BE GREAT 

NO PRO HOW BUT A FLIGHT THAT LEAVES FROM SALO TO PARIS ON MAY FIFTEEN THREE YOUR NEXT FLIGHT WILL BE ON MAY NINETEEN AT TEN P TO LONDON 

Thanks for your help~
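
To put a number on the gap between the reference text and the model output above, a word error rate (WER) can be computed. The following is a minimal sketch, not part of the toolkit used in this issue: the function names `normalize`, `word_errors`, and `wer` are made up for illustration, and the normalization (uppercase, strip punctuation) is an assumption. It handles only the first sentence pair; the second pair would also need numerals such as "15" spelled out before comparison.

```python
import re

def normalize(text):
    """Uppercase, replace punctuation with spaces, and split into words."""
    return re.sub(r"[^A-Z' ]", " ", text.upper()).split()

def word_errors(ref, hyp):
    """Word-level Levenshtein distance between two word lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # reference word missing from hypothesis
                curr[j - 1] + 1,         # extra word in hypothesis
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    """Word error rate: word errors divided by reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference is empty")
    return word_errors(ref, hyp) / len(ref)

# First sentence pair from the post above.
reference = "No, the cheaper option would be great though."
hypothesis = "NO CHIEF WOULD BE GREAT"
print(f"WER: {wer(reference, hypothesis):.2f}")  # WER: 0.50
```

Four of the eight reference words are lost or replaced (three deletions and one substitution), which gives the 0.50 shown.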

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 11 (1 by maintainers)

Top GitHub Comments

2 reactions
Mecoli1219 commented, Nov 12, 2022

Any hint or idea on this?

Hello, @benam2! I recently ran a similar experiment and found the same issue. I think the problem might be that the default config.yaml for asr doesn’t use a language model (LM) in decoding, so the generated sequence contains some weird words. BTW, I ran inference on my own voice (I am a poor English speaker), so the effect of not using an LM is more obvious.

NO THAT SHEEP HER UPSUN WILL BE QUATE E DOGH
NO PROBIN HOWBOT OF FLIGHT TLAT LEAVES FROM SITOES TO PARIS ON MAIDE FIFTEEN AT THREE P M YOU R NEST FRIGHT WILL BE A MAY NIGHTING AT TEMP IM TO LONDON

I’m not sure whether your problem is the same as mine. You can try setting decoder_type in config.yaml to "kenlm" and running the experiment again. I hope this helps.

1 reaction
Mecoli1219 commented, Nov 18, 2022

@benam2 well, then it may not be the LM problem. I have come up with three ideas:

  1. The quality of the sound (noise, amplitude, …).
  2. Could the non-native speaker be the main reason? Refer to "Accent modification for speech recognition of non-native speakers using neural style transfer".
  3. You can try the task with a different upstream model. (Some say that HuBERT is better?)
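
For the first idea, a quick sanity check of the recording can rule out the most common input problems. The sketch below is only an illustration under stated assumptions: `audio_report` and its `clip_level` parameter are hypothetical names, the samples are assumed to be already decoded to mono floats in [-1.0, 1.0] (decoding the .flac file is left out), and the 16 kHz target reflects the sampling rate that wav2vec 2.0 base models are trained on. A synthetic tone stands in for a real recording.

```python
import math

EXPECTED_RATE = 16000  # wav2vec 2.0 base models are trained on 16 kHz audio

def audio_report(samples, sample_rate, clip_level=0.999):
    """Summarise a mono signal given as floats in [-1.0, 1.0]."""
    if not samples:
        raise ValueError("no samples")
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    clipped = sum(abs(s) >= clip_level for s in samples) / len(samples)
    return {
        "rate_ok": sample_rate == EXPECTED_RATE,   # False means resampling is needed
        "peak": peak,                              # very low peak: recording is too quiet
        "rms_dbfs": 20 * math.log10(rms) if rms > 0 else float("-inf"),
        "clipped_fraction": clipped,               # above 0: the signal is saturating
    }

# Synthetic stand-in for a decoded recording (an assumption, not real data):
# one second of a quiet 440 Hz tone sampled at 44.1 kHz.
rate = 44100
tone = [0.05 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
report = audio_report(tone, rate)
print(report)
```

In this synthetic case the report flags a sample-rate mismatch and a low level, the kind of input issue that would need resampling or gain adjustment before blaming the model.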