
[RAG] Expected RAG output after fine-tuning

See original GitHub issue

Hi there.

Perhaps the following isn't even a real issue, but I'm a bit confused by the outputs I got.

I'm trying to fine-tune RAG on a set of question-answer pairs I have (for now not that many, fewer than 1k). I've split them as suggested (train.source, train.target, val.source, …). After running finetune_rag.py, the only outputs generated were two small files (~2 kB):

  • git_log.json
  • hparams.pkl

Is that right? I was expecting a large binary file, or something like that, containing the weight matrices, so that I could reuse them afterwards in a new run.

Could you please tell me what I'm missing here?
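
(To make the expectation concrete, here is a sketch of what I assumed I could eventually do once fine-tuning had saved the weights. The model_dir path is hypothetical, standing in for wherever a completed run would have written the model:)

    from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer

    # Hypothetical path: wherever the fine-tuned weights would have been saved.
    model_dir = "output_ft/finetuned_rag"

    tokenizer = RagTokenizer.from_pretrained(model_dir)
    retriever = RagRetriever.from_pretrained(model_dir, index_name="exact")
    model = RagSequenceForGeneration.from_pretrained(model_dir, retriever=retriever)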


I provide more details below. By the way, I have two NVIDIA RTX 3090s (24 GB each), but they were barely used during the whole process (which took ~3 hours).

Command:

python finetune_rag.py \
    --data_dir rag_manual_qa_finetuning \
    --output_dir output_ft \
    --model_name_or_path rag-sequence-base \
    --model_type rag_sequence \
    --gpus 2 \
    --distributed_retriever pytorch
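
(For completeness: finetune_rag.py consumes line-aligned source/target files, one example per line. A minimal sketch of how such files can be produced; qa_pairs stands in for my actual data, and a test split is written the same way:)

    # Sketch: build line-aligned files for finetune_rag.py.
    # qa_pairs stands in for the real data: one (question, answer) per line.
    qa_pairs = [
        ("What output does finetune_rag.py produce?", "Checkpoints in output_dir."),
        ("Which retriever was used?", "The pytorch distributed retriever."),
    ]

    for split in ("train", "val"):  # a test split is prepared the same way
        with open(f"rag_manual_qa_finetuning/{split}.source", "w") as src, \
             open(f"rag_manual_qa_finetuning/{split}.target", "w") as tgt:
            for question, answer in qa_pairs:
                src.write(question + "\n")
                tgt.write(answer + "\n")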

Logs (strangely, they even seem to be generated in duplicate; I don't know why):

loading configuration file rag-sequence-base/config.json
Model config RagConfig {
  "architectures": [
    "RagSequenceForGeneration"
  ],
  "dataset": "wiki_dpr",
  "dataset_split": "train",
  "do_deduplication": true,
  "do_marginalize": false,
  "doc_sep": " // ",
  "exclude_bos_score": false,
  "forced_eos_token_id": 2,
  "generator": {
    "_name_or_path": "",
    "_num_labels": 3,
    "activation_dropout": 0.0,
    "activation_function": "gelu",
    "add_bias_logits": false,
    "add_cross_attention": false,
    "add_final_layer_norm": false,
    "architectures": [
      "BartModel",
      "BartForMaskedLM",
      "BartForSequenceClassification"
    ],
    "attention_dropout": 0.0,
    "bad_words_ids": null,
    "bos_token_id": 0,
    "chunk_size_feed_forward": 0,
    "classif_dropout": 0.0,
    "classifier_dropout": 0.0,
    "d_model": 1024,
    "decoder_attention_heads": 16,
    "decoder_ffn_dim": 4096,
    "decoder_layerdrop": 0.0,
    "decoder_layers": 12,
    "decoder_start_token_id": 2,
    "diversity_penalty": 0.0,
    "do_sample": false,
    "dropout": 0.1,
    "early_stopping": false,
    "encoder_attention_heads": 16,
    "encoder_ffn_dim": 4096,
    "encoder_layerdrop": 0.0,
    "encoder_layers": 12,
    "encoder_no_repeat_ngram_size": 0,
    "eos_token_id": 2,
    "extra_pos_embeddings": 2,
    "finetuning_task": null,
    "force_bos_token_to_be_generated": false,
    "forced_bos_token_id": null,
    "forced_eos_token_id": 2,
    "gradient_checkpointing": false,
    "id2label": {
      "0": "LABEL_0",
      "1": "LABEL_1",
      "2": "LABEL_2"
    },
    "init_std": 0.02,
    "is_decoder": false,
    "is_encoder_decoder": true,
    "label2id": {
      "LABEL_0": 0,
      "LABEL_1": 1,
      "LABEL_2": 2
    },
    "length_penalty": 1.0,
    "max_length": 20,
    "max_position_embeddings": 1024,
    "min_length": 0,
    "model_type": "bart",
    "no_repeat_ngram_size": 0,
    "normalize_before": false,
    "normalize_embedding": true,
    "num_beam_groups": 1,
    "num_beams": 1,
    "num_hidden_layers": 12,
    "num_return_sequences": 1,
    "output_attentions": false,
    "output_hidden_states": false,
    "output_past": false,
    "output_scores": false,
    "pad_token_id": 1,
    "prefix": " ",
    "pruned_heads": {},
    "repetition_penalty": 1.0,
    "return_dict": false,
    "return_dict_in_generate": false,
    "scale_embedding": false,
    "sep_token_id": null,
    "static_position_embeddings": false,
    "task_specific_params": {
      "summarization": {
        "early_stopping": true,
        "length_penalty": 2.0,
        "max_length": 142,
        "min_length": 56,
        "no_repeat_ngram_size": 3,
        "num_beams": 4
      }
    },
    "temperature": 1.0,
    "tie_encoder_decoder": false,
    "tie_word_embeddings": true,
    "tokenizer_class": null,
    "top_k": 50,
    "top_p": 1.0,
    "torchscript": false,
    "transformers_version": "4.4.0.dev0",
    "use_bfloat16": false,
    "use_cache": true,
    "vocab_size": 50265
  },
  "index_name": "exact",
  "index_path": null,
  "is_encoder_decoder": true,
  "label_smoothing": 0.0,
  "max_combined_length": 300,
  "model_type": "rag",
  "n_docs": 5,
  "output_retrieved": false,
  "passages_path": null,
  "question_encoder": {
    "_name_or_path": "",
    "add_cross_attention": false,
    "architectures": [
      "DPRQuestionEncoder"
    ],
    "attention_probs_dropout_prob": 0.1,
    "bad_words_ids": null,
    "bos_token_id": null,
    "chunk_size_feed_forward": 0,
    "decoder_start_token_id": null,
    "diversity_penalty": 0.0,
    "do_sample": false,
    "early_stopping": false,
    "encoder_no_repeat_ngram_size": 0,
    "eos_token_id": null,
    "finetuning_task": null,
    "forced_bos_token_id": null,
    "forced_eos_token_id": null,
    "gradient_checkpointing": false,
    "hidden_act": "gelu",
    "hidden_dropout_prob": 0.1,
    "hidden_size": 768,
    "id2label": {
      "0": "LABEL_0",
      "1": "LABEL_1"
    },
    "initializer_range": 0.02,
    "intermediate_size": 3072,
    "is_decoder": false,
    "is_encoder_decoder": false,
    "label2id": {
      "LABEL_0": 0,
      "LABEL_1": 1
    },
    "layer_norm_eps": 1e-12,
    "length_penalty": 1.0,
    "max_length": 20,
    "max_position_embeddings": 512,
    "min_length": 0,
    "model_type": "dpr",
    "no_repeat_ngram_size": 0,
    "num_attention_heads": 12,
    "num_beam_groups": 1,
    "num_beams": 1,
    "num_hidden_layers": 12,
    "num_return_sequences": 1,
    "output_attentions": false,
    "output_hidden_states": false,
    "output_scores": false,
    "pad_token_id": 0,
    "position_embedding_type": "absolute",
    "prefix": null,
    "projection_dim": 0,
    "pruned_heads": {},
    "repetition_penalty": 1.0,
    "return_dict": false,
    "return_dict_in_generate": false,
    "sep_token_id": null,
    "task_specific_params": null,
    "temperature": 1.0,
    "tie_encoder_decoder": false,
    "tie_word_embeddings": true,
    "tokenizer_class": null,
    "top_k": 50,
    "top_p": 1.0,
    "torchscript": false,
    "transformers_version": "4.4.0.dev0",
    "type_vocab_size": 2,
    "use_bfloat16": false,
    "use_cache": true,
    "vocab_size": 30522
  },
  "reduce_loss": false,
  "retrieval_batch_size": 8,
  "retrieval_vector_size": 768,
  "title_sep": " / ",
  "use_cache": true,
  "use_dummy_dataset": false,
  "vocab_size": null
}

Model name 'rag-sequence-base' not found in model shortcut name list (facebook/dpr-question_encoder-single-nq-base, facebook/dpr-question_encoder-multiset-base). Assuming 'rag-sequence-base' is a path, a model identifier, or url to a directory containing tokenizer files.
Didn't find file rag-sequence-base/question_encoder_tokenizer/tokenizer.json. We won't load it.
Didn't find file rag-sequence-base/question_encoder_tokenizer/added_tokens.json. We won't load it.
loading file rag-sequence-base/question_encoder_tokenizer/vocab.txt
loading file None
loading file None
loading file rag-sequence-base/question_encoder_tokenizer/special_tokens_map.json
loading file rag-sequence-base/question_encoder_tokenizer/tokenizer_config.json
Model name 'rag-sequence-base' not found in model shortcut name list (facebook/bart-base, facebook/bart-large, facebook/bart-large-mnli, facebook/bart-large-cnn, facebook/bart-large-xsum, yjernite/bart_eli5). Assuming 'rag-sequence-base' is a path, a model identifier, or url to a directory containing tokenizer files.
Didn't find file rag-sequence-base/generator_tokenizer/tokenizer.json. We won't load it.
Didn't find file rag-sequence-base/generator_tokenizer/added_tokens.json. We won't load it.
loading file rag-sequence-base/generator_tokenizer/vocab.json
loading file rag-sequence-base/generator_tokenizer/merges.txt
loading file None
loading file None
loading file rag-sequence-base/generator_tokenizer/special_tokens_map.json
loading file rag-sequence-base/generator_tokenizer/tokenizer_config.json
Loading passages from wiki_dpr
Downloading: 9.64kB [00:00, 10.8MB/s]                                           
Downloading: 67.5kB [00:00, 59.5MB/s]                                           
WARNING:datasets.builder:Using custom data configuration psgs_w100.nq.no_index-dummy=False,with_index=False
Downloading and preparing dataset wiki_dpr/psgs_w100.nq.no_index (download: 66.09 GiB, generated: 73.03 GiB, post-processed: Unknown size, total: 139.13 GiB) to /home/usp/.cache/huggingface/datasets/wiki_dpr/psgs_w100.nq.no_index-dummy=False,with_index=False/0.0.0/91b145e64f5bc8b55a7b3e9f730786ad6eb19cd5bc020e2e02cdf7d0cb9db9c1...
Downloading: 100%|█████████████████████████| 4.69G/4.69G [07:11<00:00, 10.9MB/s]
[~50 similar "Downloading: 100%" progress lines for the ~1.33 GB wiki_dpr embedding shards omitted]
0 examples [00:00, ? examples/s]
2021-03-05 12:11:39.666323: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Dataset wiki_dpr downloaded and prepared to /home/usp/.cache/huggingface/datasets/wiki_dpr/psgs_w100.nq.no_index-dummy=False,with_index=False/0.0.0/91b145e64f5bc8b55a7b3e9f730786ad6eb19cd5bc020e2e02cdf7d0cb9db9c1. Subsequent calls will reuse this data.
loading weights file rag-sequence-base/pytorch_model.bin
All model checkpoint weights were used when initializing RagSequenceForGeneration.

Some weights of RagSequenceForGeneration were not initialized from the model checkpoint at rag-sequence-base and are newly initialized: ['rag.generator.lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[The same RagConfig dump and tokenizer-loading messages are then printed a second time, verbatim; this is the duplication mentioned above.]
GPU available: True, used: True
INFO:lightning:GPU available: True, used: True
TPU available: False, using: 0 TPU cores
INFO:lightning:TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
INFO:lightning:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
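
(An aside on the ~66 GiB wiki_dpr download seen above: for smoke-testing the setup, the retriever can be pointed at the dummy dataset instead, i.e. the use_dummy_dataset flag visible in the config dump. A sketch:)

    from transformers import RagRetriever

    # use_dummy_dataset=True loads a tiny stand-in for the full wiki_dpr index,
    # avoiding the ~66 GiB download while debugging the pipeline end to end.
    retriever = RagRetriever.from_pretrained(
        "facebook/rag-sequence-base",
        index_name="exact",
        use_dummy_dataset=True,
    )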

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 12 (6 by maintainers)

Top GitHub Comments

2 reactions
lhoestq commented, Mar 10, 2021

Hi! If I recall correctly, the model is saved using PyTorch Lightning's on_save_checkpoint. So the issue might come from the checkpointing config at

https://github.com/huggingface/transformers/blob/2295d783d5787bcd4c99ea0ddb2a9403697fc126/examples/research_projects/rag/callbacks_rag.py#L36-L43
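
(For readers who don't want to open the link: the callback configured there is a standard PyTorch Lightning ModelCheckpoint. A generic sketch of that kind of callback; the values below are illustrative, not the script's actual settings:)

    from pytorch_lightning.callbacks import ModelCheckpoint

    # Illustrative values only; the real settings live in callbacks_rag.py.
    # If save_top_k were 0, or if the monitored validation metric were never
    # logged, no .ckpt file would be written, matching the behavior reported above.
    checkpoint_callback = ModelCheckpoint(
        dirpath="output_ft",   # where checkpoint files should land
        monitor="val_em",      # assumed metric name, for illustration
        mode="max",
        save_top_k=1,
    )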

2 reactions
LysandreJik commented, Mar 6, 2021
Read more comments on GitHub >
