
RoBERTa model problem

See original GitHub issue

Hi,

I was trying to fine-tune the RoBERTa model on my own task using the implementation from this fantastic repo. However, I encountered one significant problem.

I trained the model on Google Colab and saved it with torch.save(model.state_dict(), f"/content/drive/My Drive/roberta/models/state_fold{i}"), then loaded it on my local machine with model.load_state_dict(torch.load(path, map_location='cpu')). After loading, the extract_features method returns the same output regardless of the input. My workaround has been to freeze all of RoBERTa's parameters during training and to reload RoBERTa with self.roberta = torch.hub.load('pytorch/fairseq', 'roberta.base') after loading the state_dict. That fixes the issue, but it is still not satisfying, because I can then only fine-tune the classification heads rather than the whole model.
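
For reference, here is a minimal sketch of the save/load flow described above. The wrapper class, head size, and file names are illustrative assumptions, not taken from the issue; the useful diagnostic is inspecting the missing/unexpected keys returned by load_state_dict with strict=False and putting the model in eval mode before comparing extract_features outputs.

import torch

# Hypothetical wrapper around RoBERTa plus a classification head, mirroring the
# setup described above (class name, head size and paths are assumptions).
class MyTaskModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
        self.head = torch.nn.Linear(768, 2)

    def forward(self, tokens):
        features = self.roberta.extract_features(tokens)  # (batch, seq_len, 768)
        return self.head(features[:, 0, :])               # classify on the <s> token

# On Colab, after fine-tuning:
model = MyTaskModel()
torch.save(model.state_dict(), "state_fold0")

# On the local machine the model must be built exactly the same way before loading:
model = MyTaskModel()
state = torch.load("state_fold0", map_location="cpu")
missing, unexpected = model.load_state_dict(state, strict=False)
print("missing:", missing)          # non-empty lists here would explain "constant" outputs
print("unexpected:", unexpected)
model.eval()                        # disable dropout before comparing extract_features outputs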

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 13 (7 by maintainers)

Top GitHub Comments

1 reaction
myleott commented, Aug 19, 2019

Hmm, this is quite off topic, but that BPE code technically supports any language, since it’s byte-level. However, most of the codes are English words, so it would essentially be doing character-level modeling. We don’t have code released for creating your own BPE in this format, since the dictionary is borrowed from GPT-2. We are currently working on a multilingual version, but there is no expected date yet.
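
As a quick illustration of the byte-level point (a sketch assuming the pytorch/fairseq hub entry point; the example strings are arbitrary), any Unicode text can be encoded and losslessly decoded with the GPT-2 BPE, but non-English text is split into many short, near-character-level pieces:

import torch

roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')

tokens_en = roberta.encode('Hello world')      # GPT-2 BPE + dictionary ids
tokens_ka = roberta.encode('გამარჯობა')        # Georgian "hello"

print(tokens_en.numel(), tokens_ka.numel())    # the non-English word yields far more pieces
print(roberta.decode(tokens_ka))               # byte-level BPE round-trips the original text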

1 reaction
stefan-it commented, Aug 16, 2019

@3NFBAGDU Good question! A README was recently added for pre-training RoBERTa:

https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.pretraining.md

But one problem could be that a previously built dictionary is downloaded and used; see this line:

wget -O gpt2_bpe/dict.txt https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/dict.txt

You can’t use that dictionary for a non-English language 🤔

Maybe @myleott could give a hint how to create such a dictionary for another corpus/language 🤗
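
Until there is official guidance, one rough sketch of building such a dict.txt yourself is to count symbols with fairseq's Dictionary class, assuming the corpus has already been BPE-encoded into space-separated symbols, one sentence per line (the file names below are made up for illustration). The output uses the same "symbol count" format as the downloaded gpt2_bpe/dict.txt. Alternatively, running fairseq-preprocess without the --srcdict flag should build a dictionary from the training split automatically.

from fairseq.data import Dictionary

d = Dictionary()
with open('corpus.train.bpe', encoding='utf-8') as f:   # already BPE-encoded training data
    for line in f:
        for symbol in line.split():
            d.add_symbol(symbol)                         # counts each occurrence

d.finalize()         # sort symbols by frequency and pad the vocab to a multiple of 8
d.save('dict.txt')   # writes one "symbol count" pair per line, as fairseq expects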

Read more comments on GitHub >

Top Results From Across the Web

RoBERTa model problem #975 - facebookresearch/fairseq
Hi, I was trying to fine tune the roberta model with my own task with the implementation from this fantastic repo.

RoBERTa - Hugging Face
The RoBERTa model was proposed in RoBERTa: A Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar...

Overview of ROBERTa model - GeeksforGeeks
RoBERTa stands for Robustly Optimized BERT Pre-training Approach. It was presented by researchers at Facebook and Washington University.

Fine-Tunned on Roberta-base as NER problem [0.533] - Kaggle
Some weights of RobertaForTokenClassification were not initialized from the model checkpoint at /kaggle/input/roberta-base/ and are newly initialized: ...

Fine-tune a RoBERTa Encoder-Decoder model trained on ...
First, I must admit that probably a text generation problem is not usually approached with this kind of solution, using encoders models like ...
