
wav2vec pretrain help

See original GitHub issue

❓ Questions and Help

I’m trying to pretrain a custom wav2vec2 model on my own dataset, which is about 10k hours. The official wav2vec2 base model was used for parameter initialization. The training loss suddenly drops a lot after a few epochs of training, and the validation loss becomes higher.

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

  1. Training loss (purple) doesn’t look right, and validation loss (red) keeps increasing. [image]

  2. Does the code perplexity curve look normal? [image]

  3. Does the gradient curve look normal? [image]

@alexeib can you kindly help? Thanks.

Code

I use the same config as the wav2vec2 base model.
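For context (not shown in the issue), fairseq’s wav2vec 2.0 README launches base pretraining roughly like this; the config directory and name come from the fairseq repo, and the manifest path is a placeholder:

```shell
# Pretrain a wav2vec 2.0 base model with the stock config.
# task.data should point at the directory containing train.tsv / valid.tsv.
fairseq-hydra-train \
    task.data=/path/to/manifests \
    --config-dir examples/wav2vec/config/pretraining \
    --config-name wav2vec2_base_librispeech
```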

What have you tried?

I tried lowering the learning rate and switching from fp16 to fp32 training, but neither helped.

What’s your environment?

  • fairseq Version (e.g., 1.0 or master): master
  • PyTorch Version (e.g., 1.0): 1.7.1
  • OS (e.g., Linux): Linux
  • How you installed fairseq (pip, source): source
  • Build command you used (if compiling from source): pip install -e
  • Python version: 3.7
  • CUDA/cuDNN version: 11.0
  • GPU models and configuration: 4x V100
  • Any other relevant information:

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

1 reaction
alexeib commented on Oct 30, 2021

extractor_mode: layer_norm is much more stable and typically has performance similar to the default (just make sure you also set feature_grad_mult to 1.0 and task.normalize=true).

layer_norm_first allows you to train beyond 500k updates without crashing in fp16 mode. By itself it is not as accurate as post layer norm, but when you train for longer you outperform post-layer-norm models. For this to be effective you need to significantly increase the learning rate compared to a post-layer-norm model (by 20-30x).
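As a sketch, the two suggestions above translate into hydra overrides on top of the base pretraining config. The option names follow fairseq’s wav2vec2 model and task configs; the learning-rate value is an assumption, only an illustrative starting point for the “20-30x” guidance:

```shell
# Layer-norm feature extractor + layer_norm_first, per the comment above.
# The lr override is an assumption (roughly 10-30x a typical base-config lr)
# and should be tuned for your data.
fairseq-hydra-train \
    task.data=/path/to/manifests \
    task.normalize=true \
    model.extractor_mode=layer_norm \
    model.layer_norm_first=true \
    model.feature_grad_mult=1.0 \
    'optimization.lr=[0.005]' \
    --config-dir examples/wav2vec/config/pretraining \
    --config-name wav2vec2_base_librispeech
```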

0 reactions
stale[bot] commented on Apr 18, 2022

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!
