
Train RobertaModel from scratch for my dataset

See original GitHub issue

I am trying to train RobertaModel from scratch. I am following this blog, but instead of model = RobertaForMaskedLM(config=config) I am starting with configuration = RobertaConfig() and model = RobertaModel(configuration), and then continuing with the other steps. However, I am getting the error TypeError: forward() got an unexpected keyword argument 'labels'. The whole code piece:

from transformers import RobertaConfig, RobertaModel

configuration = RobertaConfig()
model = RobertaModel(configuration)  # base encoder, no language-modeling head
from transformers import LineByLineTextDataset
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="./train.txt",
    block_size=128,
)
from transformers import DataCollatorForLanguageModeling

data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=False, mlm_probability=0.15
)
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./Model1",
    overwrite_output_dir=True,
    num_train_epochs=1,
    per_gpu_train_batch_size=64,
    save_steps=10_000,
    save_total_limit=2,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
    prediction_loss_only=True,
)
trainer.train()

Is there some other way to do pre-training? Am I missing something here?
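For context, the TypeError arises because the data collator adds a labels field to each batch and the Trainer passes it straight into model.forward(), but the bare RobertaModel is just the encoder, has no language-modeling head, and therefore accepts no labels argument. A minimal sketch of the mismatch (not from the original issue; assumes a recent transformers version and uses dummy inputs):

import torch
from transformers import RobertaConfig, RobertaModel, RobertaForMaskedLM

config = RobertaConfig()
input_ids = torch.randint(0, config.vocab_size, (1, 8))  # dummy batch
labels = input_ids.clone()

# RobertaForMaskedLM has an LM head and computes a loss from `labels`.
mlm_model = RobertaForMaskedLM(config)
print(mlm_model(input_ids=input_ids, labels=labels).loss)

# The bare encoder does not accept `labels`, which produces the error from the issue.
base_model = RobertaModel(config)
try:
    base_model(input_ids=input_ids, labels=labels)
except TypeError as err:
    print(err)  # forward() got an unexpected keyword argument 'labels'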

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
patil-suraj commented, Jun 23, 2020

You can train RobertaForMaskedLM using the MLM objective and then load it into RobertaForSequenceClassification for classification. RobertaForSequenceClassification will take care of taking the last-layer vector and feeding it to a classification layer.
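A runnable sketch of that recipe, assuming a tokenizer has already been trained and saved to ./tokenizer (a placeholder path), that the train.txt file from the question exists, and a recent transformers version; the hyperparameters and the ./Model1 output directory simply mirror the question and are not prescriptive:

from transformers import (
    RobertaConfig,
    RobertaForMaskedLM,
    RobertaForSequenceClassification,
    RobertaTokenizerFast,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("./tokenizer")

# 1) Pre-train a model that has a language-modeling head, using the MLM objective.
config = RobertaConfig(vocab_size=tokenizer.vocab_size)
mlm_model = RobertaForMaskedLM(config)

dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="./train.txt",
    block_size=128,
)
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

training_args = TrainingArguments(
    output_dir="./Model1",
    overwrite_output_dir=True,
    num_train_epochs=1,
    per_device_train_batch_size=64,
    save_steps=10_000,
    save_total_limit=2,
)

trainer = Trainer(
    model=mlm_model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
)
trainer.train()
trainer.save_model("./Model1")

# 2) Load the pre-trained encoder into a sequence-classification model for fine-tuning;
#    num_labels=2 is a placeholder for the downstream task.
clf_model = RobertaForSequenceClassification.from_pretrained("./Model1", num_labels=2)

The from_pretrained call reuses the pre-trained encoder weights and initializes only the new classification head randomly, which is the hand-off described above.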

0 reactions
stale[bot] commented, Aug 22, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Read more comments on GitHub >

Top Results From Across the Web

Create a Tokenizer and Train a Huggingface RoBERTa Model ...
For our experiment, we are going to train from scratch a RoBERTa model, it will become the encoder and the decoder of a...
Read more >
Train Roberta from scratch for custom dataset - Intermediate
Hey there, I am training Roberta from scratch for protein sequences. To this end, I build a tokenizer for protein sequences, which is...
Read more >
Training RoBERTa from scratch - the missing guide
At this point, I've decided to go with RoBERTa model. Model you choose determines the tokenizer that you will have to train.
Read more >
Huggingface Transformers: Retraining roberta-base using the ...
The RoBERTa model (Liu et al., 2019) introduces some key ... This post does not delve into training the LM and tokenizer from...
Read more >
Pretraining a RoBERTa Model from Scratch - Packt Subscription
In this chapter, we will build a RoBERTa model from scratch. The model will take the bricks of the Transformer construction kit we...
Read more >
