
Really slow Wav2Vec 2 pretraining

See original GitHub issue

What is your question?

I’m trying to pretrain a wav2vec2 base model on my own dataset, and it is really slow; I want to speed it up. My dataset contains about 100 hours of speech, stored in the data directory. All of the files are single-channel 16 kHz wavs. I generate the manifest with the standard script:

python examples/wav2vec/wav2vec_manifest.py /path/to/waves --dest /manifest/path --ext $ext --valid-percent $valid
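
For reference, the script’s output is a plain manifest: the first line is the root directory, and each following line is a tab-separated relative path and sample count. Roughly like this (the file names below are made-up examples):

/path/to/waves
speaker1/utt001.wav	160000
speaker1/utt002.wav	96000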

The script runs fine and gives a correct manifest. I expected a reasonable slowdown compared to the original setup of 64 Tesla V100s, but not a thousand times slower: currently one epoch takes 5 minutes, and at that rate doing 400,000 updates would take almost 4 years!
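
For scale, here is the batch arithmetic the training flags below imply — a back-of-the-envelope sketch in Python, assuming 16 kHz audio, --max-tokens 1400000, 6 GPUs, and --update-freq 64/6; fairseq’s actual batching (with --max-sample-size cropping and length grouping) will differ:

sample_rate = 16_000
dataset_seconds = 100 * 3600                        # ~100 hours of speech
per_gpu_audio = 1_400_000 / sample_rate             # --max-tokens, ~87.5 s per forward pass
audio_per_update = per_gpu_audio * 6 * (64 / 6)     # GPUs x gradient accumulation, ~5600 s
print(f"audio per update: {audio_per_update / 3600:.2f} h")             # ~1.56 h
print(f"updates per epoch: {dataset_seconds / audio_per_update:.0f}")   # ~64

Whether the multi-year projection holds depends on how many updates actually fit in an epoch, which is worth checking against the wall-clock numbers in the training log.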

Code

I didn’t write any code, but here’s the command I’m using for training:

fairseq-train --distributed-world-size 6 ./manifest/ --save-dir ./res --fp16 --num-workers 16 --task audio_pretraining \
--criterion wav2vec --arch wav2vec2 --log-keys '["prob_perplexity","code_perplexity","temp"]' \
--quantize-targets --extractor-mode default --conv-feature-layers '[(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512,2,2)] * 2' \
--final-dim 256 --latent-vars 320 --latent-groups 2 --latent-temp '(2,0.5,0.999995)' --infonce --optimizer adam \
--adam-betas '(0.9,0.98)' --adam-eps 1e-06 --lr-scheduler polynomial_decay --total-num-update 10000 --lr 0.0005 \
--warmup-updates 800 --mask-length 10 --mask-prob 0.65 --mask-selection static --mask-other 0 \
--encoder-layerdrop 0.05 --dropout-input 0.1 --dropout-features 0.1 --feature-grad-mult 0.1 \
--loss-weights '[0.1, 10]' --conv-pos 128 --conv-pos-groups 16 --num-negatives 100 \
--cross-sample-negatives 0 --max-sample-size 250000 --min-sample-size 32000 --dropout 0.1 \
--attention-dropout 0.1 --weight-decay 0.01 --max-tokens 1400000 --max-update 10000 \
--skip-invalid-size-inputs-valid-test --ddp-backend no_c10d --update-freq 64/6 --save-interval 5000

What have you tried?

I tried installing Apex, since it is supposed to speed up fairseq. It gave a 2x boost, but training would still take years.
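
For reference, a from-source Apex build at the time followed the NVIDIA Apex README and looked roughly like this (the exact flags may differ across Apex versions):

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./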

What’s your environment?

  • fairseq Version (e.g., 1.0 or master): 0.10.1
  • PyTorch Version (e.g., 1.0): 1.8.0a0+1606899
  • OS (e.g., Linux): Ubuntu 20.04.1 LTS (Focal Fossa)
  • How you installed fairseq (pip, source): pip install fairseq==0.10.1
  • Build command you used (if compiling from source): None
  • Python version: Python 3.8.5
  • CUDA/cuDNN version: CUDA 11.1.105
  • GPU models and configuration: six RTX 3090s
  • Any other relevant information: I also tested on a VM with two RTX 2080 Ti GPUs

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 8

Top GitHub Comments

1 reaction
jubick1337 commented, Mar 19, 2021

Yep @marma @apoca909, I got it, but it is still a really slow process. I experimented a lot with different setups and have yet to achieve the training speed reported in the original paper.

0 reactions
stale[bot] commented, Apr 18, 2022

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!

Read more comments on GitHub.

Top Results From Across the Web

Really slow Wav2Vec 2 pretraining · Issue #3114 - GitHub
I’m trying to pretrain wav2vec2 base model on my own dataset and it is really slow. I want to speed it up. My...
Anybody got good with wav2vec 2.0 yet? - Google Groups
If anyone got good results, what I want to know is: what are acceptable values for loss and validation accuracy in audio pre-training?...
Why is Wav2Vec pretraining loss not decreasing? - Models
During the pre-training phase, the loss starts off around 4, decreases, and then shoots up to 6.658 and stays there. The accuracy is...
Wav2Vec2.0 on the Edge: Performance Evaluation | DeepAI
Wav2vec 2.0 is a state-of-the-art model which learns speech representations from unlabeled speech data, aka self-supervised learning. The pretrained...
Compressing Wav2vec 2.0 - Medium
Model compression entails compressing a machine learning model into a smaller model without losing too much performance. Suppose we have a...
