KeyError: 'best_loss' when loading checkpoint as Fairseq Model

Hi guys,

Thank you for the incredible work.

I tried to load this model from the larger checkpoint in the following manner:

from fairseq.models.transformer import TransformerModel

MODEL_DIR = '/content/drive/My Drive/src/models/'  # as shown in the traceback below
model = TransformerModel.from_pretrained(
    model_name_or_path=MODEL_DIR,
    checkpoint_file='prophetnet_large_pretrained_160G_14epoch_model.pt',
)

but was presented with a key error:

KeyError                                  Traceback (most recent call last)
<ipython-input-13-782ea15f21fd> in <module>()
      1 MODEL_DIR = '/content/drive/My Drive/src/models/'
----> 2 model = TransformerModel.from_pretrained(model_name_or_path=MODEL_DIR,                                         checkpoint_file='prophetnet_large_pretrained_160G_14epoch_model.pt')

4 frames
/usr/local/lib/python3.6/dist-packages/fairseq/checkpoint_utils.py in _upgrade_state_dict(state)
    298     if "optimizer_history" not in state:
    299         state["optimizer_history"] = [
--> 300             {"criterion_name": "CrossEntropyCriterion", "best_loss": state["best_loss"]}
    301         ]
    302         state["last_optimizer_state"] = state["optimizer"]

KeyError: 'best_loss'

Versions: fairseq==0.9.0, torch==1.4.0

Any advice on how to proceed would be greatly appreciated. I want to load ProphetNet into a fairseq model so that I can adapt the architecture to a custom task.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 7

Top GitHub Comments

1 reaction
c976237222 commented, May 18, 2022

@chrisdoyleIE How did you solve this problem? I encountered the same error. Thanks.

1 reaction
steve3p0 commented, Mar 5, 2022

That code is no longer available at the link you provided. Could you please tell me where I can find it?

Thanks!

Hi, this happens because we removed the optimizer history logs from the released checkpoint to reduce the file size; only the model weights are kept. As a result, loading the checkpoint directly fails because those entries are missing. You can refer to [this code](https://github.com/microsoft/ProphetNet/blob/master/src/prophetnet/ngram_s2s_model.py#L146), which calls model.load_state_dict(states), to load our pretrained weights.
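For anyone who cannot reach the linked code, here is a minimal sketch of the idea described above. It is not the ProphetNet authors' implementation. It assumes the released file follows the usual fairseq layout, with the weights stored under a "model" key, which the thread does not confirm for this particular checkpoint. The helper names `extract_model_weights` and `missing_training_keys` are made up for illustration, and a plain dict stands in for the real checkpoint so the snippet runs without fairseq or torch.

```python
def extract_model_weights(checkpoint):
    """Return the weights dict from an already-loaded checkpoint.

    Assumption: weights live under a "model" key (the usual fairseq layout).
    If that key is absent, the checkpoint itself is treated as the weights.
    """
    if isinstance(checkpoint, dict) and "model" in checkpoint:
        return checkpoint["model"]
    return checkpoint


def missing_training_keys(checkpoint, required=("best_loss", "optimizer")):
    """List the training-state keys that the upgrade step in the traceback
    reads (when "optimizer_history" is absent) but the checkpoint lacks."""
    if "optimizer_history" in checkpoint:
        return []
    return [key for key in required if key not in checkpoint]


# Toy stand-in for a stripped checkpoint: weights only, no training state.
stripped = {"model": {"encoder.embed_tokens.weight": [0.1, 0.2]}}

missing = missing_training_keys(stripped)
states = extract_model_weights(stripped)
print(missing)        # ['best_loss', 'optimizer']
print(sorted(states)) # ['encoder.embed_tokens.weight']
```

With the real file, the checkpoint would first be read with `torch.load(path, map_location="cpu")`, and the resulting `states` passed to `model.load_state_dict(states)` on a model that has already been built with a matching architecture. This bypasses the `from_pretrained` path that raised the KeyError.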


