
Add error message to Wav2Vec2 & Hubert if labels > vocab_size

See original GitHub issue

🚀 Feature request

Add a better error message to HubertForCTC and Wav2Vec2ForCTC if any labels are larger than the vocab size.

Motivation

Following this issue: https://github.com/huggingface/transformers/issues/12264 it is clear that an error message should be thrown if any of the labels are > self.config.vocab_size, or else silent errors can sneak into the training script.

So we should modify Wav2Vec2ForCTC, TFWav2Vec2ForCTC, and HubertForCTC to raise a clear error message in this case.
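To illustrate the idea, here is a minimal sketch of the kind of check being requested, factored into a standalone helper. The function name `check_ctc_labels` is hypothetical (not from the issue or the library); the actual fix would live inside the models' `forward` methods and compare against `self.config.vocab_size`.

```python
import torch


def check_ctc_labels(labels: torch.Tensor, vocab_size: int) -> None:
    """Raise a descriptive error if any label index is outside the vocabulary.

    Sketch of the proposed validation; the merged implementation may differ.
    """
    if labels.max() >= vocab_size:
        raise ValueError(
            f"Label values must be <= vocab_size: {vocab_size}"
        )


# Example: a vocab of size 32, with one out-of-range label (35).
labels = torch.tensor([[3, 7, 35]])
try:
    check_ctc_labels(labels, vocab_size=32)
except ValueError as err:
    print(err)
```

Without such a check, out-of-range labels are silently folded into the CTC loss computation, which is exactly the silent failure mode described in the linked issue.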

Your contribution

This is a good first issue and should be rather easy to accomplish. I’m happy to give more guidance if needed.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
VirajBagal commented, Sep 18, 2021

Thanks for letting me know. I had seen it, but since the issue was still open, I thought something might be left to do.

1 reaction
thevasudevgupta commented, Jun 20, 2021

I will create a PR to fix this.


