
Finetuning wav2vec2 with another letter vocabulary and `wer_args` setting problem.

See original GitHub issue

❓ Questions and Help

What is your question?

About Wav2Vec2 fine-tune

I want to use the flashlight python bindings and kenlm to see my WER while fine-tuning on labeled data. The [README](https://github.com/pytorch/fairseq/blob/master/examples/wav2vec/README.md) says to add `+criterion.wer_args='[/path/to/kenlm, /path/to/lexicon, 2, -1]'` to the command line, but I don't understand the exact meaning. My kenlm is installed in `home/pychen/kenlm`; does that mean `/path/to/kenlm` should be set to `home/pychen/kenlm`? Here's what my kenlm folder looks like:

~/kenlm$ ls
build  BUILDING  clean_query_only.sh  cmake  CMakeLists.txt  compile_query_only.sh  COPYING  COPYING.3  COPYING.LESSER.3  Doxyfile  LICENSE  lm  MANIFEST.in  python  README.md  setup.py  util
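The kenlm path in `wer_args` should point to a compiled language-model file (an `.arpa` or, preferably, a binary `.bin`), not to the kenlm source tree shown above. A minimal sketch of producing one, assuming kenlm was built under `~/kenlm/build` and a training text file named `corpus.txt` (hypothetical name):

```shell
# Train a 4-gram LM from a plain-text corpus (lmplz lives in kenlm's build/bin)
~/kenlm/build/bin/lmplz -o 4 < corpus.txt > lm.arpa

# Convert the ARPA file to kenlm's compact binary format for faster loading
~/kenlm/build/bin/build_binary lm.arpa lm.bin
```

The resulting `lm.bin` is the kind of file the kenlm-model option expects.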

And what should `/path/to/lexicon` be set to if I want to use another letter vocabulary, different from the [provided vocabulary](https://dl.fbaipublicfiles.com/fairseq/wav2vec/dict.ltr.txt)? Here's what my vocab looks like:

args["CHAR_TO_INDEX"] = {" ": 1, "'": 22, "1": 30, "0": 29, "3": 37, "2": 32, "5": 34, "4": 38, "7": 36, "6": 35, "9": 31, "8": 33, "A": 5, "C": 17,
                         "B": 20, "E": 2, "D": 12, "G": 16, "F": 19, "I": 6, "H": 9, "K": 24, "J": 25, "M": 18, "L": 11, "O": 4, "N": 7, "Q": 27,
                         "P": 21, "S": 8, "R": 10, "U": 13, "T": 3, "W": 15, "V": 23, "Y": 14, "X": 26, "Z": 28, "<EOS>": 39}

args["INDEX_TO_CHAR"] = {1: " ", 22: "'", 30: "1", 29: "0", 37: "3", 32: "2", 34: "5", 38: "4", 36: "7", 35: "6", 31: "9", 33: "8", 5: "A", 17: "C",
                         20: "B", 2: "E", 12: "D", 16: "G", 19: "F", 6: "I", 9: "H", 24: "K", 25: "J", 18: "M", 11: "L", 4: "O", 7: "N", 27: "Q",
                         21: "P", 8: "S", 10: "R", 13: "U", 3: "T", 15: "W", 23: "V", 14: "Y", 26: "X", 28: "Z", 39: "<EOS>"}
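A fairseq dictionary such as `dict.ltr.txt` is just a text file with one `token count` pair per line (the counts only affect ordering), so a file like it can be generated from a mapping like the one above. A minimal sketch, using an abridged version of the question's `CHAR_TO_INDEX` (note that fairseq's letter dict conventionally spells the space as the boundary token `|`, and `<EOS>` is not part of the standard letter dict, so treat this purely as an illustration):

```python
CHAR_TO_INDEX = {" ": 1, "'": 22, "A": 5, "E": 2, "T": 3, "<EOS>": 39}  # abridged

# INDEX_TO_CHAR is just the inverse mapping, so it can be derived
# instead of maintained by hand, and checked for duplicate indices.
INDEX_TO_CHAR = {i: c for c, i in CHAR_TO_INDEX.items()}
assert len(INDEX_TO_CHAR) == len(CHAR_TO_INDEX), "duplicate indices"

def write_dict(path, char_to_index):
    """Write a fairseq-style dict file: one 'token count' pair per line.

    Counts are only used for sorting, so a dummy count of 1 is fine.
    The space character is written as the word-boundary token '|'.
    """
    tokens = sorted(char_to_index, key=char_to_index.get)
    with open(path, "w") as f:
        for tok in tokens:
            if tok == " ":
                tok = "|"
            f.write(f"{tok} 1\n")

write_dict("dict.ltr.txt", CHAR_TO_INDEX)
```

Deriving `INDEX_TO_CHAR` instead of writing it out by hand also removes a whole class of copy-paste bugs from the two large literal dicts.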

What’s your environment?

  • fairseq Version : master
  • PyTorch Version : 1.8.1
  • OS : Ubuntu
  • How you installed fairseq : source
  • Python version: 3.8.8
  • CUDA/cuDNN version: 11.2
  • GPU models and configuration: NVIDIA-A100
  • Any other relevant information: None

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 11

Top GitHub Comments

2 reactions
medabalimi commented, Jul 1, 2021

My bad. Mixed up the infer.py options with the training ones. Use:

`+criterion.wer_kenlm_model=$LM_PATH +criterion.wer_lexicon=$LEX_PATH +criterion.wer_lm_weight=2 +criterion.wer_word_score=-1`
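Put together, a fine-tuning invocation with these per-field overrides might look like the following. All paths are placeholders, `$LM_PATH` should be a compiled kenlm binary, and the config directory/name follow the layout of fairseq's wav2vec examples (treat the exact config name as an assumption):

```shell
# Hypothetical fine-tuning command; adjust paths and config name to your setup.
fairseq-hydra-train \
    task.data=/path/to/manifests \
    +criterion.wer_kenlm_model=$LM_PATH \
    +criterion.wer_lexicon=$LEX_PATH \
    +criterion.wer_lm_weight=2 \
    +criterion.wer_word_score=-1 \
    --config-dir /path/to/fairseq/examples/wav2vec/config/finetuning \
    --config-name base_100h
```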

1 reaction
medabalimi commented, Jul 1, 2021

`omegaconf.errors.ValidationError: Cannot convert 'ListConfig' to string : '['/swa/lm/lm/lm.bin', '/swa/lm/lm/lexicon.txt', 2, -1]'`

Nope. You need to use `--lm-model /path/to/kenlm.bin --lexicon /path/to/lexicon --lm-weight 2 --word-score -1`
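Whichever flag set is used, the lexicon file itself has the same shape: one word per line, followed by its letter spelling, commonly terminated by the word-boundary token `|` (e.g. `HELLO H E L L O |`). A minimal sketch of generating one from a word list, assuming an uppercase letter vocabulary like the one in the question:

```python
def lexicon_line(word):
    # "HELLO" -> "HELLO H E L L O |"
    return f"{word} {' '.join(word)} |"

def write_lexicon(path, words):
    """Write a decoder lexicon: each unique word mapped to its spelling."""
    with open(path, "w") as f:
        for w in sorted(set(words)):
            f.write(lexicon_line(w.upper()) + "\n")

write_lexicon("lexicon.txt", ["hello", "world"])
```

Every word in the lexicon must be spellable with tokens from the dictionary, so a custom vocabulary and a custom lexicon need to be built from the same character set.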
