Really slow Wav2Vec 2 pretraining
What is your question?
I’m trying to pretrain the wav2vec2 base model on my own dataset, and it is really slow; I want to speed it up. My dataset contains about 100 hours of speech, stored in the data directory; all files are single-channel 16 kHz WAVs. The following command
python examples/wav2vec/wav2vec_manifest.py /path/to/waves --dest /manifest/path --ext $ext --valid-percent $valid
runs fine and produces a correct manifest. I expected a reasonable slowdown compared to the original setup of 64 Tesla V100s, but not a thousand times slower. Currently one epoch takes 5 minutes, and if I want to do 400,000 updates it would take almost 4 years!
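As a sanity check on the data setup, here is a minimal sketch for verifying the manifest (assuming fairseq's standard layout: root directory on the first line, then a tab-separated relative path and sample count per line; the train.tsv path below is a placeholder):

# Minimal manifest sanity check. Assumes the standard fairseq layout:
# line 1 is the root directory, every other line is "<relative/path.wav>\t<num_samples>".
import os

manifest = "/manifest/path/train.tsv"  # placeholder; point this at your --dest directory
with open(manifest) as f:
    root = f.readline().strip()
    total_samples, n_files = 0, 0
    for line in f:
        rel_path, num_samples = line.rstrip("\n").split("\t")
        assert os.path.exists(os.path.join(root, rel_path)), rel_path
        total_samples += int(num_samples)
        n_files += 1

# single-channel 16 kHz audio, as described above
print(f"{n_files} files, ~{total_samples / 16000 / 3600:.1f} hours of audio")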
Code
I didn’t write any code, but here is the command I’m using for training.
fairseq-train --distributed-world-size 6 ./manifest/ --save-dir ./res --fp16 --num-workers 16 --task audio_pretraining \
  --criterion wav2vec --arch wav2vec2 --log-keys '["prob_perplexity","code_perplexity","temp"]' \
  --quantize-targets --extractor-mode default --conv-feature-layers '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] * 2' \
  --final-dim 256 --latent-vars 320 --latent-groups 2 --latent-temp '(2,0.5,0.999995)' --infonce --optimizer adam \
  --adam-betas '(0.9,0.98)' --adam-eps 1e-06 --lr-scheduler polynomial_decay --total-num-update 10000 --lr 0.0005 \
  --warmup-updates 800 --mask-length 10 --mask-prob 0.65 --mask-selection static --mask-other 0 \
  --encoder-layerdrop 0.05 --dropout-input 0.1 --dropout-features 0.1 --feature-grad-mult 0.1 \
  --loss-weights '[0.1, 10]' --conv-pos 128 --conv-pos-groups 16 --num-negatives 100 \
  --cross-sample-negatives 0 --max-sample-size 250000 --min-sample-size 32000 --dropout 0.1 \
  --attention-dropout 0.1 --weight-decay 0.01 --max-tokens 1400000 --max-update 10000 \
  --skip-invalid-size-inputs-valid-test --ddp-backend no_c10d --update-freq 64/6 --save-interval 5000
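For a rough sense of scale, here is a back-of-the-envelope sketch of the effective batch size and projected wall-clock time. It assumes the base recipe's convention of simulating 64 GPUs with --update-freq = 64 / num_gpus (as far as I can tell fairseq parses --update-freq as an integer, so with 6 GPUs that would be rounded up to 11); the seconds-per-update figure is a placeholder to replace with the numbers from your own training log.

# Back-of-the-envelope estimate of effective batch size and training time.
# Assumptions: --update-freq must be an integer (64/6 rounded up to 11), and
# seconds_per_update is a placeholder to be read off your own fairseq log.
import math

num_gpus = 6
max_tokens = 1_400_000                    # per-GPU batch size in audio samples (from the command)
update_freq = math.ceil(64 / num_gpus)    # 11, to simulate the 64-GPU recipe

samples_per_update = max_tokens * num_gpus * update_freq
print(f"update_freq={update_freq}, "
      f"~{samples_per_update / 16000 / 3600:.1f} hours of audio per update")

seconds_per_update = 10.0                 # placeholder: measure this from your training log
total_updates = 400_000
print(f"projected wall clock: {seconds_per_update * total_updates / 86400:.0f} days")

Under these assumptions one simulated batch already covers roughly 1.6 hours of audio, so with only about 100 hours of data very few optimizer updates happen per epoch, and the total update count dominates the training time.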
What have you tried?
I tried installing apex, since it is supposed to speed up fairseq. It gave a 2x boost, but training would still take years.
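As a quick check that the GPUs and the apex fused kernels are actually visible to the environment, here is a minimal sketch (my assumption being that fairseq only uses apex's fused LayerNorm/Adam when they can be imported, and otherwise falls back to native PyTorch ops):

# Quick environment check: GPU visibility and apex fused kernels.
import torch

print(torch.__version__, "CUDA available:", torch.cuda.is_available())
print("GPUs:", [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())])

try:
    # fairseq prefers these fused implementations when apex is importable (assumption)
    from apex.normalization import FusedLayerNorm  # noqa: F401
    from apex.optimizers import FusedAdam          # noqa: F401
    print("apex fused kernels available")
except ImportError:
    print("apex not found; falling back to native PyTorch implementations")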
What’s your environment?
- fairseq Version (e.g., 1.0 or master): 0.10.1
- PyTorch Version (e.g., 1.0): 1.8.0a0+1606899
- OS (e.g., Linux): Ubuntu 20.04.1 LTS (Focal Fossa)
- How you installed fairseq (pip, source): pip install fairseq==0.10.1
- Build command you used (if compiling from source): None
- Python version: Python 3.8.5
- CUDA/cuDNN version: CUDA 11.1.105
- GPU models and configuration: six RTX 3090
- Any other relevant information: I also tested on a VM with two RTX 2080 Ti GPUs
Top GitHub Comments
Yep @marma @apoca909, I got it, but it is still a really slow process. I experimented a lot with different setups and have yet to achieve the training speed indicated in the original paper.
Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!