Finetune HuBERT model: Adding new vocabulary
Environment info
- `transformers` version: 4.12.2
- Platform: Mac
- Python version: 3.7
- PyTorch version (GPU?): 1.9
- Tensorflow version (GPU?): No
- Using GPU in script?: No
- Using distributed or parallel setup in script?: No
I just run simple code to load a pretrained HuBERT model:
```python
from transformers import Wav2Vec2Processor, HubertForCTC
import torch
import librosa

PROCESSOR = Wav2Vec2Processor.from_pretrained('facebook/hubert-large-ls960-ft')
model = HubertForCTC.from_pretrained('facebook/hubert-large-ls960-ft')
tokenizer = PROCESSOR.tokenizer
```
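For reference, here is a minimal inference sketch (my addition, not from the original issue; `sample.wav` is a hypothetical 16 kHz audio file) showing the loaded checkpoint transcribing a single utterance:

```python
# Hypothetical usage sketch: transcribe one audio file with the checkpoint
# loaded above. HuBERT expects 16 kHz mono input.
speech, _ = librosa.load('sample.wav', sr=16000)
inputs = PROCESSOR(speech, sampling_rate=16000, return_tensors='pt')
with torch.no_grad():
    logits = model(inputs.input_values).logits
pred_ids = torch.argmax(logits, dim=-1)
print(PROCESSOR.batch_decode(pred_ids))
```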
On a smaller dataset, I am able to get a good WER, around 0.0.
But if I add new tokens to the vocabulary using the code below:
```python
from transformers import Wav2Vec2Processor, HubertForCTC
import torch
import librosa

PROCESSOR = Wav2Vec2Processor.from_pretrained('facebook/hubert-large-ls960-ft')
model = HubertForCTC.from_pretrained('facebook/hubert-large-ls960-ft')
tokenizer = PROCESSOR.tokenizer
tokenizer.add_tokens(new_tokens=[' ', 'Ä', 'Ö', 'Ü'])
```
the loss and WER clearly get worse and worse, and eventually the loss becomes NaN.
Is this the correct way to add new characters? The dataset is the same in both training runs.
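As a hypothetical diagnostic (my addition, not from the original thread), the mismatch is easy to see: `add_tokens` grows the tokenizer but leaves the pretrained CTC output head at its original size, so label ids for the new characters fall outside the head's output dimension, which is a classic way to drive the CTC loss to NaN:

```python
# Continuing from the snippet above: the tokenizer and the CTC head disagree.
print(len(tokenizer))              # e.g. 36 after adding four tokens
print(model.lm_head.out_features)  # still the checkpoint's original vocab size, e.g. 32

# Labels encoded with ids >= model.lm_head.out_features index past the CTC
# log-probability matrix, so the loss (and gradients) blow up to NaN.
```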
Cool, it worked. Thank you
Hey @harrypotter90,
Exactly, sorry, I forgot to mention this parameter. To summarize, I would recommend adding the new tokens and loading your model as follows:
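The code that originally followed this comment did not survive extraction. Below is a minimal sketch of what the recommendation most plausibly looks like, assuming the forgotten parameter is `ignore_mismatched_sizes` and that the new vocabulary size is passed as a config override to `from_pretrained`:

```python
from transformers import Wav2Vec2Processor, HubertForCTC

processor = Wav2Vec2Processor.from_pretrained('facebook/hubert-large-ls960-ft')
tokenizer = processor.tokenizer

# Grow the tokenizer first so len(tokenizer) reflects the new characters.
# The space character is usually already covered by the tokenizer's
# word-delimiter token '|', so it typically does not need to be added.
tokenizer.add_tokens(new_tokens=['Ä', 'Ö', 'Ü'])

# Assumption: pass the new vocab size as a config override and set
# ignore_mismatched_sizes=True so the pretrained lm_head (sized for the old
# vocabulary) is skipped and a freshly initialized head of the right shape
# is created instead.
model = HubertForCTC.from_pretrained(
    'facebook/hubert-large-ls960-ft',
    vocab_size=len(tokenizer),
    ignore_mismatched_sizes=True,
)
```

Note that the resized head is randomly initialized, so expect the WER to start high and come down over fine-tuning rather than matching the pretrained checkpoint immediately.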