
Fine-tune BERTForMaskedLM

See original GitHub issue

Hello,

I am working on a spelling-correction project. I used the pre-trained “bert-base-cased” model, but the results are not accurate enough, so I planned to fine-tune BERT for the masked-LM task. I couldn't find any examples of fine-tuning a BERT model for masked LM. I tried to use “run_language_modeling.py” for fine-tuning, but I ran into the following error:

C:\Users\ravida6d\spell_correction\transformers\examples\language-modeling>python run_language_modeling.py --output_dir ="C:\\Users\\ravida6d\\spell_correction\\contextualSpellCheck\\fine_tune\\" --model_type = bert --model_name_or_path = bert-base-cased --do_train --train_data_file =$TRAIN_FILE --do_eval --eval_data_file =$TEST_FILE --mlm

C:\Users\ravida6d\AppData\Local\Continuum\anaconda3\envs\contextualSpellCheck\lib\site-packages\transformers\training_args.py:291: FutureWarning: The `evaluate_during_training` argument is deprecated in favor of `evaluation_strategy` (which has more options)
  FutureWarning,

Traceback (most recent call last):
  File "run_language_modeling.py", line 313, in <module>
    main()
  File "run_language_modeling.py", line 153, in main
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
  File "C:\Users\ravida6d\AppData\Local\Continuum\anaconda3\envs\contextualSpellCheck\lib\site-packages\transformers\hf_argparser.py", line 151, in parse_args_into_dataclasses

    raise ValueError(f"Some specified arguments are not used by the HfArgumentParser: {remaining_args}")
ValueError: Some specified arguments are not used by the HfArgumentParser: ['bert', 'bert-base-cased']

I don't understand how to use this script. Can anyone explain how to fine-tune BERT for masked LM?
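
The ValueError itself comes from the command line: HfArgumentParser is built on argparse, and the spaces around each = make argparse read bert and bert-base-cased as stray positional arguments, which is exactly what the error reports. Removing the spaces, and using %VAR% expansion (cmd.exe does not expand $TRAIN_FILE), gives an invocation along these lines; the paths are the ones from the issue and purely illustrative:

    python run_language_modeling.py --output_dir="C:\Users\ravida6d\spell_correction\contextualSpellCheck\fine_tune" --model_type=bert --model_name_or_path=bert-base-cased --do_train --train_data_file=%TRAIN_FILE% --do_eval --eval_data_file=%TEST_FILE% --mlm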

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

1 reaction
naturecreator commented, Sep 29, 2020

While fine-tuning, we can only see the loss and perplexity, which is useful. Is it also possible to see the model's accuracy, and to use TensorBoard, when running the “run_language_modeling.py” script? It would also be really helpful if someone could explain how the loss is calculated for the BertForMaskedLM task, since no labels are provided while fine-tuning.
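
On the loss question: the labels are created from the input itself. The data collator used by run_language_modeling.py (DataCollatorForLanguageModeling) randomly masks ~15% of the tokens, copies the original token ids into a labels tensor at those positions, and sets every other position to -100, which the cross-entropy loss ignores. A minimal sketch, assuming a recent transformers version where the model returns an output object with a .loss attribute:

    import torch
    from transformers import (
        BertForMaskedLM,
        BertTokenizerFast,
        DataCollatorForLanguageModeling,
    )

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
    model = BertForMaskedLM.from_pretrained("bert-base-cased")

    # The collator masks ~15% of tokens and builds `labels`: the original
    # token ids at masked positions, -100 everywhere else.
    collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm=True, mlm_probability=0.15
    )
    batch = collator([tokenizer("The quick brown fox jumps over the lazy dog.")])

    # Cross-entropy is computed only where labels != -100, i.e. only at the
    # masked positions. (With a sentence this short, it can happen that no
    # token gets masked, in which case the loss comes out as NaN; rerun or
    # use longer text.)
    with torch.no_grad():
        outputs = model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["labels"],
        )
    print(outputs.loss)

As for TensorBoard: if the tensorboard package is installed, Trainer writes event files to the directory given by --logging_dir (a runs/ subdirectory by default), so tensorboard --logdir runs should show the loss curves. There is no built-in masked-token accuracy metric; you would pass your own compute_metrics function to Trainer and take the argmax of the logits at the positions where labels != -100.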

0 reactions
ucas010 commented, Nov 28, 2022

Hi, how can I use this repo for spelling error correction? Could you please help me?
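
The thread does not answer this, but the idea behind masked-LM spelling correction can be sketched with the fill-mask pipeline: replace the suspect word with [MASK] and look at the model's top-scoring candidates. The model name and sentence below are illustrative; contextualSpellCheck itself is a spaCy extension built around this kind of logic.

    from transformers import pipeline

    # Score replacement candidates for a masked (suspected-misspelled) token.
    fill = pipeline("fill-mask", model="bert-base-cased")
    for candidate in fill("The quick brown fox [MASK] over the lazy dog."):
        print(candidate["token_str"], candidate["score"])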

Read more comments on GitHub >

Top Results From Across the Web

Fine-tuning BERT Model on domain specific language and for ...
Hi guys First of all, what I am trying to do: I want to fine-tune a BERT Model ... We are using BertForMaskedLM...
Read more >
Finetune and generate text with BertForMaskedLM · Issue #2119
Questions & Help I am trying to fine-tune and generate text using BertForMaskedLM. Although my script works I am not getting the output...
Read more >
Fine Tuning BERT using Masked Language Modelling
Hello, In this tutorial, we are going to fine-tune or pre-train our BERT model (from the huggingface transformers) using a famous ...
Read more >
Fine-tune a BERT model for context specific embeddings
The model that is used is one of the BERTForLM familly. The idea is to create a dataset using the TextDataset that tokenizes...
Read more >
How to fine-tune BERT using HuggingFace - Joe Cummings
HuggingFace offers several versions of the BERT model including a base BertModel , BertLMHeadMoel , BertForPretraining , BertForMaskedLM ...
Read more >
