[Wav2Vec2] Improve SpecAugment function by converting numpy based function to pytorch based function

🚀 Feature request

As can be seen here: https://github.com/huggingface/transformers/blob/11655fafdd42eb56ad94e09ecd84d4dc2d1041ae/src/transformers/models/wav2vec2/modeling_wav2vec2.py#L47, the function _compute_mask_indices (responsible for SpecAugment) of Wav2Vec2 is written in NumPy, which means that it cannot run on the GPU. The function could simply be rewritten in PyTorch, which should make training on GPU faster.

This “Good First Issue” is about converting _compute_mask_indices to PyTorch while keeping the same functionality.
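
To illustrate the kind of rewrite being asked for, here is a minimal sketch of span masking done entirely with PyTorch tensor operations. It is not the library's implementation: the name compute_mask_indices_torch, its parameters, and the simplifications (no attention mask or padding handling, a fixed number of spans per row) are assumptions made for illustration only.

```python
import torch


def compute_mask_indices_torch(shape, mask_prob, mask_length, device=None, generator=None):
    """Hypothetical sketch: sample SpecAugment-style span masks without NumPy.

    shape: (batch_size, sequence_length)
    mask_prob: approximate fraction of frames to mask (before span overlap)
    mask_length: length of each masked span
    Returns a bool tensor of `shape`, True where a frame is masked.
    """
    batch_size, sequence_length = shape
    if mask_length < 1 or mask_length > sequence_length:
        raise ValueError("mask_length must be in [1, sequence_length]")

    mask = torch.zeros(batch_size, sequence_length, dtype=torch.bool, device=device)

    # Number of valid span start positions, so that no span runs past the end.
    num_starts = sequence_length - mask_length + 1
    # Number of spans per row (assumption: same count for every row).
    num_spans = min(int(mask_prob * sequence_length / mask_length), num_starts)
    if num_spans == 0:
        return mask

    # Draw distinct start indices per row from a uniform distribution.
    weights = torch.ones(batch_size, num_starts, device=device)
    starts = torch.multinomial(weights, num_spans, replacement=False, generator=generator)

    # Expand every start index into mask_length consecutive frame indices.
    offsets = torch.arange(mask_length, device=device)
    span_indices = (starts.unsqueeze(-1) + offsets).reshape(batch_size, -1)

    # Mark the selected frames; overlapping spans simply write True twice.
    mask.scatter_(1, span_indices, True)
    return mask
```

Because every step is a tensor operation, passing device="cuda" (when a GPU is available) keeps the whole computation on the GPU and avoids a host-to-device copy of the mask. Spans may overlap in this sketch, so the number of masked frames per row can be lower than num_spans * mask_length.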

Your contribution

I’m happy to guide the contributor through the PR!

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 17 (11 by maintainers)

Top GitHub Comments

2 reactions
punitvara commented, Mar 1, 2021

Yes, sure. Let me try. I will send a PR tomorrow; it is a bit late to work on it now.

1 reaction
01-vyom commented, May 3, 2021

@patrickvonplaten Made a PR.
