How to finetune with a new dataset?
Hi, I am trying to fine-tune PRIMERA from Hugging Face using the `Seq2SeqTrainer`, with a new dataset. However, I keep getting ROUGE scores of 0. May I know which part of the code is wrong?
```python
from datasets import load_dataset  # needed for the load_dataset calls below
from huggingface_hub import notebook_login
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments
import nltk
import numpy as np
import torch

notebook_login()

TOKENIZER = AutoTokenizer.from_pretrained("allenai/PRIMERA")
MODEL = AutoModelForSeq2SeqLM.from_pretrained("allenai/PRIMERA")
MODEL.gradient_checkpointing_enable()  # trade compute for memory

PAD_TOKEN_ID = TOKENIZER.pad_token_id
DOCSEP_TOKEN_ID = TOKENIZER.convert_tokens_to_ids("<doc-sep>")
```
Here I load my own reformatted version of the multi_news dataset from the Hugging Face Hub. The format is a (`src`, `tgt`) pair, where `src` is the related documents and `tgt` is the summary. It is almost the same as the original multi_news dataset, except that I added a few more words at the front of each source, along with the `|||||` separators.
```python
train = load_dataset('cammy/multi_news_formatted_small', split='train[:100]', use_auth_token=True, cache_dir="D:")
valid = load_dataset('cammy/multi_news_formatted_small', split='valid[:10]', use_auth_token=True, cache_dir="D:")
test = load_dataset('cammy/multi_news_formatted_small', split='test[:10]', use_auth_token=True, cache_dir="D:")
```
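As a quick sanity check on the format (the `src`/`tgt` column names come from the dataset description above; the snippet just inspects the first record):

```python
# Each `src` concatenates the related documents, separated by "|||||"
# (the multi_news convention); `tgt` is the reference summary.
print(train[0]["src"][:300])
print(train[0]["tgt"][:300])
```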
Then I preprocess the data.
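The preprocessing code was not included in the post; below is a minimal sketch of what it does, assuming the standard PRIMERA conventions: documents joined with `<doc-sep>`, global attention on the first token and on each `<doc-sep>` token, and padded label positions masked with -100. The column names (`src`, `tgt`) and the length limits are assumptions:

```python
MAX_SRC_LEN = 4096  # assumed input budget for PRIMERA
MAX_TGT_LEN = 1024  # assumed summary budget

def preprocess(batch):
    # Replace the multi_news "|||||" separators with PRIMERA's <doc-sep> token.
    srcs = [src.replace("|||||", "<doc-sep>") for src in batch["src"]]
    model_inputs = TOKENIZER(
        srcs, max_length=MAX_SRC_LEN, padding="max_length", truncation=True
    )
    labels = TOKENIZER(
        batch["tgt"], max_length=MAX_TGT_LEN, padding="max_length", truncation=True
    )
    # Mask pad positions in the labels with -100 so the loss ignores them.
    model_inputs["labels"] = [
        [(tok if tok != PAD_TOKEN_ID else -100) for tok in seq]
        for seq in labels["input_ids"]
    ]
    # Global attention on <s> and on every <doc-sep>; local attention elsewhere.
    model_inputs["global_attention_mask"] = [
        [1 if tok in (TOKENIZER.bos_token_id, DOCSEP_TOKEN_ID) else 0 for tok in ids]
        for ids in model_inputs["input_ids"]
    ]
    return model_inputs

train = train.map(preprocess, batched=True, remove_columns=["src", "tgt"])
valid = valid.map(preprocess, batched=True, remove_columns=["src", "tgt"])
test = test.map(preprocess, batched=True, remove_columns=["src", "tgt"])
```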
Then, lastly:

```python
trainer.train()
```
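For context, the `trainer` construction was not shown above. Here is a minimal sketch of what it might look like; the ROUGE-based `compute_metrics`, all hyperparameters, and the `evaluate` dependency are illustrative assumptions, not the original code. Note `predict_with_generate=True` in particular: without it, `compute_metrics` receives raw logits rather than generated token ids, which decodes to garbage and scores near-zero ROUGE.

```python
import evaluate  # pip install evaluate rouge_score

rouge = evaluate.load("rouge")
nltk.download("punkt", quiet=True)  # sentence splitter used below

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    # Restore the pad token where the labels were masked with -100.
    labels = np.where(labels != -100, labels, PAD_TOKEN_ID)
    decoded_preds = TOKENIZER.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = TOKENIZER.batch_decode(labels, skip_special_tokens=True)
    # rougeLsum expects newline-separated sentences.
    decoded_preds = ["\n".join(nltk.sent_tokenize(p)) for p in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(l)) for l in decoded_labels]
    # Recent versions of `evaluate` return plain floats here.
    return rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )

training_args = Seq2SeqTrainingArguments(
    output_dir="primera-multi-news-small",  # hypothetical output path
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    learning_rate=3e-5,
    num_train_epochs=1,
    predict_with_generate=True,   # generate summaries during evaluation
    generation_max_length=256,    # cap generated length during eval
    evaluation_strategy="steps",
    eval_steps=50,
    logging_steps=50,
    fp16=True,
)

trainer = Seq2SeqTrainer(
    model=MODEL,
    args=training_args,
    train_dataset=train,
    eval_dataset=valid,
    tokenizer=TOKENIZER,
    compute_metrics=compute_metrics,
)
```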
But the resulting ROUGE scores are all 0.
I did have this issue quite a while ago, and it has since disappeared for me. There was a bug a while back where `Seq2SeqTrainer` was not taking the `global_attention_mask` into account, which may have been the problem? Might be worth updating `transformers` to the latest version (if you haven't already) and trying again.

Hi @JohnGiorgi, thanks for your reply! However, I am still having this problem when running your provided script (https://gist.github.com/JohnGiorgi/8c7dcabd3ee8a362b9174c5d145029ab) with the newest version, `transformers==4.21.0.dev0`. I used the following command to run it (on an 8×32 GB V100 EC2 instance):

The evaluation results are:
Not sure what causes this problem, but there must still be something wrong with the generation method in the Hugging Face implementation. But anyway, thanks so much for your script; it is really helpful.
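One way to check whether generation is the culprit is to bypass `Seq2SeqTrainer` entirely and call `generate` directly, passing the `global_attention_mask` yourself. This is a minimal sketch, assuming the preprocessed `valid` dataset from above:

```python
# Bypass the trainer: if generate() produces sensible summaries when the
# global_attention_mask is passed explicitly, the issue lies in how evaluation
# wires up generation, not in the model or the data.
batch = valid[:2]  # dict of column -> list for the first two examples
generated = MODEL.generate(
    input_ids=torch.tensor(batch["input_ids"]),
    attention_mask=torch.tensor(batch["attention_mask"]),
    global_attention_mask=torch.tensor(batch["global_attention_mask"]),
    max_length=256,
)
print(TOKENIZER.batch_decode(generated, skip_special_tokens=True))
```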