KeyError: 'best_loss' when loading checkpoint as Fairseq Model

Hi guys,

Thank you for the incredible work.

I tried to load this model from the larger checkpoint in the following manner:

from fairseq.models.transformer import TransformerModel

MODEL_DIR = '/content/drive/My Drive/src/models/'  # directory holding the checkpoint
model = TransformerModel.from_pretrained(
    model_name_or_path=MODEL_DIR,
    checkpoint_file='prophetnet_large_pretrained_160G_14epoch_model.pt',
)

but was presented with a key error:

KeyError                                  Traceback (most recent call last)
<ipython-input-13-782ea15f21fd> in <module>()
      1 MODEL_DIR = '/content/drive/My Drive/src/models/'
----> 2 model = TransformerModel.from_pretrained(model_name_or_path=MODEL_DIR, checkpoint_file='prophetnet_large_pretrained_160G_14epoch_model.pt')

4 frames
/usr/local/lib/python3.6/dist-packages/fairseq/checkpoint_utils.py in _upgrade_state_dict(state)
    298     if "optimizer_history" not in state:
    299         state["optimizer_history"] = [
--> 300             {"criterion_name": "CrossEntropyCriterion", "best_loss": state["best_loss"]}
    301         ]
    302         state["last_optimizer_state"] = state["optimizer"]

KeyError: 'best_loss'

Versions: fairseq==0.9.0, torch==1.4.0

Any advice on how to proceed would be greatly appreciated. I want to load ProphetNet as a fairseq model so that I can adapt the architecture to a custom task.
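The traceback above points at three lines of fairseq's `_upgrade_state_dict`. As a rough illustration of why they fail, the sketch below copies only those lines into a standalone function and runs them on a dictionary that holds nothing but weights. The names `upgrade_snippet` and `missing_training_keys` and the placeholder parameter name are hypothetical and are not part of fairseq; the rest of the real function is not reproduced here.

```python
def upgrade_snippet(state: dict) -> dict:
    """Copy of the lines shown in the traceback (298-302), for illustration only."""
    if "optimizer_history" not in state:
        state["optimizer_history"] = [
            {"criterion_name": "CrossEntropyCriterion", "best_loss": state["best_loss"]}
        ]
        state["last_optimizer_state"] = state["optimizer"]
    return state


def missing_training_keys(state: dict) -> list:
    """List the keys the snippet above would look up but cannot find."""
    if "optimizer_history" in state:
        return []  # the branch is skipped, so nothing else is read here
    return [key for key in ("best_loss", "optimizer") if key not in state]


# Hypothetical stand-in for a checkpoint that only contains model weights.
weights_only = {"encoder.layer.weight": [0.0]}

print(missing_training_keys(weights_only))  # ['best_loss', 'optimizer']

try:
    upgrade_snippet(dict(weights_only))
except KeyError as err:
    print("KeyError:", err)  # KeyError: 'best_loss'
```

Running a check like `missing_training_keys` on the dictionary returned by `torch.load` is one way to tell whether a file is a full training checkpoint or a weights-only release before handing it to `from_pretrained`.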

Top GitHub Comments

c976237222 commented, May 18, 2022

@chrisdoyleIE How did you solve this problem? I encountered the same error. Thanks.

steve3p0 commented, Mar 5, 2022

The code at the link you provided is no longer available. Could you please tell me where I can find it?

Thanks!

Hi, this happens because we removed the unneeded optimization history logs from the model to reduce the file size; only the model weights are kept in the release. As a result, loading the checkpoint directly raises an error about the missing logs. You can refer to [this code](https://github.com/microsoft/ProphetNet/blob/master/src/prophetnet/ngram_s2s_model.py#L146), which calls model.load_state_dict(states) to load our pretrained weights.
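Since the linked code is reported as no longer available, here is a minimal sketch of the approach the reply describes: read the file with `torch.load` and pass the weights to `model.load_state_dict`. It is not the ProphetNet authors' code. The helper names `unwrap_model_weights` and `load_pretrained_weights` are hypothetical, and the sketch assumes the file is either a bare parameter dictionary or a dictionary that stores the parameters under a `"model"` entry.

```python
from typing import Any, Mapping


def unwrap_model_weights(checkpoint: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the parameter dict from a loaded checkpoint.

    Assumption: a full training checkpoint nests its weights under "model",
    while a weights-only release is already the parameter dict.
    """
    nested = checkpoint.get("model")
    if isinstance(nested, Mapping):
        return nested
    return checkpoint


def load_pretrained_weights(model, checkpoint_path: str):
    """Load weights into an already-constructed torch.nn.Module."""
    import torch  # imported here so the helper above has no dependencies

    # map_location="cpu" avoids needing the device the file was saved from.
    states = torch.load(checkpoint_path, map_location="cpu")
    model.load_state_dict(unwrap_model_weights(states))
    return model
```

The model has to be built with the matching architecture before calling `load_pretrained_weights(model, path)`. With its default strict behavior, `load_state_dict` raises an error if parameter names in the file and in the model do not match.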
