
How to finetune with a new dataset?

See original GitHub issue

Hi, I am trying to fine-tune PRIMERA from Hugging Face using the Trainer with a new dataset. However, I keep getting ROUGE scores of 0. May I know which part of the code is wrong?

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments
import nltk
import numpy as np
import torch

TOKENIZER = AutoTokenizer.from_pretrained("allenai/PRIMERA")
MODEL = AutoModelForSeq2SeqLM.from_pretrained("allenai/PRIMERA")
MODEL.gradient_checkpointing_enable()

PAD_TOKEN_ID = TOKENIZER.pad_token_id
DOCSEP_TOKEN_ID = TOKENIZER.convert_tokens_to_ids("<doc-sep>")

from huggingface_hub import notebook_login
notebook_login()

Here I load my own reformatted version of the multi_news dataset from the Hugging Face Hub. The format is a (src, tgt) pair, where src contains the related source documents and tgt is the summary. It is almost the same as the original multi_news dataset, except that I added a few more words at the front along with the ||||| separator.

from datasets import load_dataset

train = load_dataset('cammy/multi_news_formatted_small', split='train[:100]', use_auth_token=True, cache_dir="D:")
valid = load_dataset('cammy/multi_news_formatted_small', split='valid[:10]', use_auth_token=True, cache_dir="D:")
test = load_dataset('cammy/multi_news_formatted_small', split='test[:10]', use_auth_token=True, cache_dir="D:")
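
For reference, this is roughly what a single example is assumed to look like (the src/tgt column names are taken from the description above and have not been checked against the actual dataset):

example = train[0]
# src: the related source documents, concatenated with "|||||" between them
# tgt: the reference summary
print(example["src"][:300])
print(example["tgt"][:300])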

Then I do the preprocessing of the data. (The preprocessing code was posted as screenshots, which are not reproduced here.)
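
Since the preprocessing code only survives as screenshots, here is a minimal sketch of what PRIMERA-style preprocessing typically looks like, assuming src/tgt columns as described above; the length limits, padding strategy, and column names are my assumptions, not the original code:

MAX_INPUT_LEN = 4096
MAX_OUTPUT_LEN = 1024

def preprocess(batch):
    # Replace the "|||||" separators with PRIMERA's <doc-sep> token.
    sources = [s.replace("|||||", " <doc-sep> ") for s in batch["src"]]
    model_inputs = TOKENIZER(sources, max_length=MAX_INPUT_LEN,
                             padding="max_length", truncation=True)
    labels = TOKENIZER(batch["tgt"], max_length=MAX_OUTPUT_LEN,
                       padding="max_length", truncation=True)
    # Mask out padding in the labels so it is ignored by the loss.
    model_inputs["labels"] = [[tok if tok != PAD_TOKEN_ID else -100 for tok in seq]
                              for seq in labels["input_ids"]]
    return model_inputs

train = train.map(preprocess, batched=True, remove_columns=train.column_names)
valid = valid.map(preprocess, batched=True, remove_columns=valid.column_names)
test = test.map(preprocess, batched=True, remove_columns=test.column_names)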

Then, lastly: trainer.train()
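
For completeness, a hedged sketch of the trainer setup that such a trainer.train() call would sit on top of. The hyperparameters are placeholders; the pieces most relevant to the ROUGE numbers are predict_with_generate=True and a compute_metrics that decodes the generated token ids back to text before scoring (this sketch uses the evaluate library, which may differ from whatever the original notebook used):

import evaluate

nltk.download("punkt", quiet=True)
rouge = evaluate.load("rouge")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    # Labels use -100 for ignored positions; restore the pad token before decoding.
    labels = np.where(labels != -100, labels, PAD_TOKEN_ID)
    decoded_preds = TOKENIZER.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = TOKENIZER.batch_decode(labels, skip_special_tokens=True)
    # rougeLsum expects sentences separated by newlines.
    decoded_preds = ["\n".join(nltk.sent_tokenize(p)) for p in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(l)) for l in decoded_labels]
    return rouge.compute(predictions=decoded_preds, references=decoded_labels,
                         use_stemmer=True)

training_args = Seq2SeqTrainingArguments(
    output_dir="./primera-multinews",   # placeholder path
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    num_train_epochs=1,
    evaluation_strategy="epoch",
    predict_with_generate=True,         # generate real summaries during evaluation
)

trainer = Seq2SeqTrainer(
    model=MODEL,
    args=training_args,
    train_dataset=train,
    eval_dataset=valid,
    tokenizer=TOKENIZER,
    compute_metrics=compute_metrics,
)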

But these are the results (posted as a screenshot): every ROUGE score comes out as 0.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 14

Top GitHub Comments

1 reaction
JohnGiorgi commented, Jul 22, 2022

I did have this issue quite a while ago, and it has since disappeared for me. There was a bug a while back where Seq2SeqTrainer was not taking the global_attention_mask into account, which may have been the problem. It might be worth updating transformers to the latest version (if you haven’t already) and trying again.
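
One way to take that variable out of the equation is to precompute the global_attention_mask as a dataset column during preprocessing. This is only a sketch, reusing the DOCSEP_TOKEN_ID defined in the question above: PRIMERA places global attention on the <s> token and on every <doc-sep> token, but whether Seq2SeqTrainer actually forwards this column to generate() still depends on the transformers version, which is the bug mentioned above.

def add_global_attention_mask(batch):
    input_ids = torch.tensor(batch["input_ids"])
    global_attention_mask = torch.zeros_like(input_ids)
    global_attention_mask[:, 0] = 1                          # global attention on <s>
    global_attention_mask[input_ids == DOCSEP_TOKEN_ID] = 1  # and on every <doc-sep>
    batch["global_attention_mask"] = global_attention_mask.tolist()
    return batch

train = train.map(add_global_attention_mask, batched=True)
valid = valid.map(add_global_attention_mask, batched=True)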

0 reactions
zhangzx-uiuc commented, Jul 23, 2022

Hi @JohnGiorgi, thanks for your reply! However, I am still having this problem when running your provided script (https://gist.github.com/JohnGiorgi/8c7dcabd3ee8a362b9174c5d145029ab) with the newest version of transformers==4.21.0.dev0. I used the following command to run it (on an 8*32GB V100 EC2 instance):

python run_summarization.py \
    --model_name_or_path allenai/PRIMERA \
    --do_train \
    --do_eval \
    --dataset_name multi_news \
    --dataset_config "3.0.0" \
    --source_prefix "summarize: " \
    --output_dir ./outputs \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --overwrite_output_dir \
    --predict_with_generate

The evaluation results are:

***** eval metrics *****
  epoch                   =        3.0
  eval_gen_len            =      128.0
  eval_loss               =     2.0331
  eval_rouge1             =        0.0
  eval_rouge2             =        0.0
  eval_rougeL             =        0.0
  eval_rougeLsum          =        0.0
  eval_runtime            = 0:11:05.88
  eval_samples            =       5621
  eval_samples_per_second =      8.441
  eval_steps_per_second   =      0.264

I am not sure what causes this problem, but there must still be something wrong with the generation method in the Hugging Face implementation. In any case, thanks very much for your script; it is really helpful.
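
A quick way to check whether generation itself (rather than the ROUGE computation) is the culprit is to decode a couple of outputs by hand. A sketch, assuming the fine-tuned checkpoint in ./outputs from the command above and PRIMERA's <doc-sep> / global-attention convention; the input text here is made up purely for illustration:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

tokenizer = AutoTokenizer.from_pretrained("./outputs")
model = AutoModelForSeq2SeqLM.from_pretrained("./outputs")

text = "First source document about some event. ||||| Second source document about the same event."
inputs = tokenizer(text.replace("|||||", " <doc-sep> "), return_tensors="pt")

# PRIMERA expects global attention on <s> and on every <doc-sep> token.
docsep_id = tokenizer.convert_tokens_to_ids("<doc-sep>")
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1
global_attention_mask[inputs["input_ids"] == docsep_id] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=128,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))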

Read more comments on GitHub.

Top Results From Across the Web

How to fine-tune a model for common downstream tasks
This guide will show you how to fine-tune Transformers models for common downstream tasks. You will use the Datasets library to quickly load...

Fine Tune Pre-trained BERT model on new dataset (and vocab)
Hi, I want to fine tune a pre-trained BERT model (because training a good one from scratch is very resource consuming). I know...

Fine Tune Transformers Model like BERT on Custom Dataset
Learn how to fine tune BERT on a custom dataset. In this video, I have explained how to finetune transformers models like BERT on a custom...

14.2. Fine-Tuning - Dive into Deep Learning
14.2.1. Steps: Pretrain a neural network model, i.e., the source model, on a source dataset (e.g., the ImageNet dataset). Create a...

Transfer learning and fine-tuning | TensorFlow Core
Fine-tuning a pre-trained model: To further improve performance, one might want to repurpose the top-level layers of the pre-trained models to the new...
