RoBERTa model problem
Hi,
I was trying to fine-tune the RoBERTa model on my own task with the implementation from this fantastic repo. However, I encountered one significant problem.
I trained the model on Google Colab and saved it with

```python
torch.save(model.state_dict(), f"/content/drive/My Drive/roberta/models/state_fold{i}")
```

and then loaded it with

```python
model.load_state_dict(torch.load(path, map_location='cpu'))
```

on my local machine, where the method `extract_features` then returned the same output regardless of the input.
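For reference, a minimal sketch of the full round trip (paths are placeholders; the `strict=True` flag and the `eval()` call are additions for this sketch: `strict=True` makes mismatched state_dict keys raise instead of being silently ignored, and `eval()` disables dropout so `extract_features` is deterministic):

```python
import torch

# Load a fresh roberta.base and restore the fine-tuned weights. Note that the
# state_dict keys must match the object load_state_dict is called on: a custom
# wrapper (e.g. one holding self.roberta) prefixes keys differently than the
# bare hub model, which can cause the saved weights to be silently ignored.
model = torch.hub.load('pytorch/fairseq', 'roberta.base')
model.load_state_dict(torch.load('state_fold0', map_location='cpu'), strict=True)
model.eval()  # disable dropout for deterministic feature extraction

tokens = model.encode('Hello world!')
features = model.extract_features(tokens)  # shape (1, seq_len, 768)
```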
I have been using a workaround: fix (freeze) all of RoBERTa's parameters during training and reload RoBERTa with

```python
self.roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
```

after loading the state_dict. This fixes the issue, but it is still not satisfying, since I can only fine-tune the classification heads and not the model itself.
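Concretely, the workaround looks roughly like this (a sketch only; the wrapper class and linear head are illustrative, not the actual project code):

```python
import torch
import torch.nn as nn

class RobertaClassifier(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
        for p in self.roberta.parameters():
            p.requires_grad = False  # RoBERTa stays frozen; only the head trains
        self.head = nn.Linear(768, num_classes)  # 768 = roberta.base hidden size

    def forward(self, tokens):
        # features: (batch, seq_len, 768); use the <s> token as a pooled representation
        features = self.roberta.extract_features(tokens)
        return self.head(features[:, 0, :])
```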
Top GitHub Comments
Hmm, this is quite off topic, but that BPE code technically supports any language, since it's byte-level; most of the codes are English words, though, so on another language it would essentially be doing character-level modeling. We don't have code released for creating your own BPE in this format, since the dictionary is borrowed from GPT-2. We are currently working on a multilingual version, but there is no expected date yet.
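A small illustration of that point (this assumes the HuggingFace `transformers` package, which ships the same GPT-2 byte-level BPE; the example strings are arbitrary):

```python
from transformers import GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")

# English text merges into a few whole-word subwords...
print(tok.tokenize("machine learning"))  # e.g. ['machine', 'Ġlearning']

# ...while non-English text falls back to many short byte-level pieces,
# which is effectively character-level modeling.
print(tok.tokenize("მანქანური სწავლება"))
```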
@3NFBAGDU Good question! A readme was recently added for pre-training RoBERTa:
https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.pretraining.md
But one problem could be that a previously built dictionary is downloaded and used; see this line:
You can’t use that dictionary for a non-English language 🤔
Maybe @myleott could give a hint how to create such a dictionary for another corpus/language 🤗
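One possible way to build such a byte-level BPE vocabulary for another corpus/language (not an official fairseq recipe, which as noted above is not released; this uses the HuggingFace `tokenizers` package, and the file names are placeholders):

```python
from tokenizers import ByteLevelBPETokenizer

# Train a GPT-2-style byte-level BPE on your own corpus.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["my_corpus.txt"],  # placeholder corpus file
    vocab_size=50000,
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)
tokenizer.save_model("my_bpe")  # writes vocab.json and merges.txt
```

The resulting `vocab.json`/`merges.txt` should then stand in for the GPT-2 files when encoding the corpus, and running `fairseq-preprocess` without `--srcdict` builds a fresh `dict.txt` from the encoded data instead of reusing the downloaded English one.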