spm_train and spm_encode: command not found
Merge all train packs into one
Merge train text
dictionary: data/lang_char/train_100_unigram5000_units.txt
Dictionary preparation
./examples/speech_recognition/datasets/prepare-librispeech.sh: line 71: spm_train: command not found
./examples/speech_recognition/datasets/prepare-librispeech.sh: line 72: spm_encode: command not found
3 data/lang_char/train_100_unigram5000_units.txt
Prepare train and test jsons
usage: asr_prep_json.py [-h] --audio-dirs AUDIO_DIRS [AUDIO_DIRS …] --labels LABELS --spm-model SPM_MODEL --dictionary DICTIONARY [--audio-format {flac,wav}] --output OUTPUT
Hello everyone, I installed fairseq from source. When I run
./examples/speech_recognition/datasets/prepare-librispeech.sh /home/yfchen/zhu/data/librispeech/ /home/yfchen/zhu/data/process_libri
I get the error shown above. How can I fix this problem? Thank you in advance.
Replacing “spm_train” with “python ${fairseq_root}/scripts/spm_train.py” and “spm_encode” with “python ${fairseq_root}/scripts/spm_encode.py” in prepare-librispeech.sh solves this problem.
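For anyone who would rather not edit the script by hand, the substitution described in this reply can be sketched in a few lines of Python. This is only an illustration: the helper name patch_spm_calls is made up here, and it assumes ${fairseq_root} is already defined inside prepare-librispeech.sh, as the reply implies.

```python
import re

# Prefix taken from the reply above; ${fairseq_root} is expected to be set
# by the shell script itself (assumption).
FAIRSEQ_SCRIPTS = "python ${fairseq_root}/scripts"

# Match bare spm_train / spm_encode commands only. The lookarounds skip
# names that are already part of a path or already end in ".py".
_BARE_SPM = re.compile(r"(?<![\w/.])(spm_train|spm_encode)(?![\w.])")


def patch_spm_calls(script_text: str) -> str:
    """Return script_text with bare spm_* commands routed through fairseq's scripts."""
    return _BARE_SPM.sub(
        lambda m: f"{FAIRSEQ_SCRIPTS}/{m.group(1)}.py", script_text
    )


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    print(patch_spm_calls("spm_train --input=input.txt --vocab_size=5000"))
```

Running the function twice leaves the text unchanged, so it is safe to apply to a script that has already been patched.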
@SeunghyunSEO LoL!! I changed your code a little and organized it here:
https://github.com/sooftware/KoSpeech/tree/master/dataset/libri
The main difference is that sentencepiece is called from Python instead of through the spm_train and spm_encode command-line tools.
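As a rough idea of what calling sentencepiece from Python looks like, here is a minimal sketch. It is not the code from the linked repository: the function names and file paths are hypothetical, and the unigram model type and vocabulary size of 5000 are only guessed from the "unigram5000" file name in the log above. It needs the sentencepiece package (pip install sentencepiece).

```python
from typing import List


def train_unigram_model(text_path: str, model_prefix: str, vocab_size: int = 5000) -> str:
    """Train a unigram model on a plain-text file; returns the .model path."""
    import sentencepiece as spm  # third-party; imported lazily

    # Writes <model_prefix>.model and <model_prefix>.vocab.
    spm.SentencePieceTrainer.train(
        input=text_path,
        model_prefix=model_prefix,
        vocab_size=vocab_size,  # assumption based on "unigram5000"
        model_type="unigram",
    )
    return model_prefix + ".model"


def encode_lines(model_path: str, lines: List[str]) -> List[List[str]]:
    """Split each line into subword pieces, like spm_encode --output_format=piece."""
    import sentencepiece as spm

    sp = spm.SentencePieceProcessor(model_file=model_path)
    return [sp.encode(line, out_type=str) for line in lines]
```

With this approach the sentencepiece binaries do not need to be on PATH, which is what the "command not found" errors were complaining about.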