Train RobertaModel from scratch for my dataset
I am trying to train RobertaModel from scratch. I am following this blog, but instead of `model = RobertaForMaskedLM(config=config)` I am starting with `configuration = RobertaConfig()` and `model = RobertaModel(configuration)`, and then continuing with the other steps. But I am getting the error `TypeError: forward() got an unexpected keyword argument 'labels'`. The whole code piece:
```python
from transformers import RobertaConfig, RobertaModel

configuration = RobertaConfig()
model = RobertaModel(configuration)

from transformers import LineByLineTextDataset

dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="./train.txt",
    block_size=128,
)

from transformers import DataCollatorForLanguageModeling

data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=False, mlm_probability=0.15
)

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./Model1",
    overwrite_output_dir=True,
    num_train_epochs=1,
    per_gpu_train_batch_size=64,
    save_steps=10_000,
    save_total_limit=2,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
    prediction_loss_only=True,
)

trainer.train()
```
Is there some other way to do pre-training? Am I missing something here?
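For context, a minimal sketch of where the error comes from: the bare encoder's `forward()` has no `labels` argument, while the masked-LM variant (the class used in the blog) accepts one and returns a loss.

```python
from transformers import RobertaConfig, RobertaModel, RobertaForMaskedLM

config = RobertaConfig()

# Bare encoder: returns hidden states only; its forward() has no `labels`
# parameter, so Trainer's call model(**batch) fails once the data collator
# adds a `labels` key to each batch.
encoder = RobertaModel(config)

# Encoder + LM head: forward() accepts `labels` and returns a masked-LM loss,
# which is what Trainer needs during pre-training.
mlm_model = RobertaForMaskedLM(config=config)
```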
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
You can train `RobertaForMaskedLM` using the MLM objective and then load it in `RobertaForSequenceClassification` for classification. `RobertaForSequenceClassification` will take care of taking the last-layer vector and feeding it to a classification layer.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
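A minimal sketch of that handoff, assuming the MLM-pretrained weights were saved to `./Model1` (the `output_dir` above; `num_labels` is illustrative and should match the downstream task):

```python
from transformers import RobertaConfig, RobertaForMaskedLM, RobertaForSequenceClassification

# 1) Pre-train with the masked-LM head (mlm=True in the data collator),
#    then save the weights.
config = RobertaConfig()
mlm_model = RobertaForMaskedLM(config=config)
# ... trainer.train() as above ...
mlm_model.save_pretrained("./Model1")

# 2) Reload the pre-trained encoder into a sequence-classification model.
#    The classification head on top of the encoder is freshly initialized
#    and is learned during fine-tuning.
clf_model = RobertaForSequenceClassification.from_pretrained("./Model1", num_labels=2)
```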