[pretrained] model classes don't check the architecture of the pretrained model they load
While comparing different models fine-tuned on XSum (most of which are BART), I made a mistake and passed "google/pegasus-xsum" to BartForConditionalGeneration:
from transformers import BartForConditionalGeneration
BartForConditionalGeneration.from_pretrained("google/pegasus-xsum")
I got:
Some weights of the model checkpoint at google/pegasus-xsum were not used when initializing BartForConditionalGeneration: ['model.encoder.layer_norm.weight', 'model.encoder.layer_norm.bias', 'model.decoder.layer_norm.weight', 'model.decoder.layer_norm.bias']
- This IS expected if you are initializing BartForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BartForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BartForConditionalGeneration were not initialized from the model checkpoint at google/pegasus-xsum and are newly initialized: ['model.encoder.embed_positions.weight', 'model.encoder.layernorm_embedding.weight', 'model.encoder.layernorm_embedding.bias', 'model.decoder.embed_positions.weight', 'model.decoder.layernorm_embedding.weight', 'model.decoder.layernorm_embedding.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
File "./bart-summarize2.py", line 8, in <module>
tokenizer = BartTokenizer.from_pretrained(mname)
File "/mnt/nvme1/code/huggingface/transformers-master/src/transformers/tokenization_utils_base.py", line 1788, in from_pretrained
return cls._from_pretrained(
File "/mnt/nvme1/code/huggingface/transformers-master/src/transformers/tokenization_utils_base.py", line 1860, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/mnt/nvme1/code/huggingface/transformers-master/src/transformers/models/roberta/tokenization_roberta.py", line 159, in __init__
super().__init__(
File "/mnt/nvme1/code/huggingface/transformers-master/src/transformers/models/gpt2/tokenization_gpt2.py", line 179, in __init__
with open(vocab_file, encoding="utf-8") as vocab_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
Is there a reason the model class doesn't check that it's being fed the wrong architecture? It could detect the mismatch and raise a corresponding error message, rather than spitting out seemingly random errors like the ones above. I was pretty sure it was a bug in the Pegasus model until I noticed that Pegasus != BART.
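One way such a check could work: compare the model_type declared in the checkpoint's config against the one the model class expects, before any weights or tokenizer files are touched. A minimal sketch, assuming a plain helper function; the name check_model_type and its wiring are hypothetical, not the actual transformers implementation:

```python
def check_model_type(expected, actual, class_name, checkpoint):
    """Hypothetical guard: raise a clear error when a checkpoint's
    declared model_type does not match the model class loading it."""
    if actual != expected:
        raise ValueError(
            f"Checkpoint '{checkpoint}' declares model_type '{actual}', "
            f"but {class_name} expects '{expected}'. You are probably "
            "loading the wrong architecture."
        )

# The config of google/pegasus-xsum declares "pegasus", while
# BartForConditionalGeneration expects "bart", so the guard fires:
try:
    check_model_type("bart", "pegasus",
                     "BartForConditionalGeneration", "google/pegasus-xsum")
except ValueError as err:
    print(err)
```

With a guard like this, the mismatch above would fail with one explicit message instead of a weight-initialization warning followed by an unrelated TypeError from the tokenizer.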
Thanks.
Issue Analytics
- Created: 3 years ago
- Comments: 12 (10 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi, I've made some progress on this issue. I think I've fixed it for instantiating models. Shall I submit a PR to show whether my approach is fine? I've essentially added an assert statement in the from_pretrained method of the PretrainedConfig class.

Hi @LysandreJik, is someone working on this? I'd like to make my first contribution to the project.