
Loading fine_tuned BertModel fails due to prefix error

See original GitHub issue

I am loading a pretrained BERT model with BertModel.from_pretrained because I feed the pooled_output representation directly to a loss, without a head. After fine-tuning the model, I save it as in run_classifier.py. Afterwards, I want to load the fine-tuned model, again without a head, so I use BertModel.from_pretrained again to initialize it, this time from the directory where the config and model files are stored. When trying to load the fine-tuned model, none of the weights are found and I get:

Weights of BertModel not initialized from pretrained model: ['bert.embeddings.word_embeddings.weight', 'bert.embeddings.position_embeddings.weight', 'bert.embeddings.token_type_embeddings.weight', 'bert.embeddings.LayerNorm.weight', 'bert.embeddings.LayerNorm.bias', 'bert.encoder.layer.0.attention.self.query.weight', 'bert.encoder.layer.0.attention.self.query.bias', 'bert.encoder.layer.0.attention.self.key.weight', ...]

This seems to be due to this line in modeling.py. Since BertModel.from_pretrained does not create a bert attribute (in contrast to the BERT models with a head), the 'bert.' prefix is used erroneously instead of the '' prefix, which causes the weights of the fine-tuned model not to be found. If I change this line to additionally check whether we are loading a fine-tuned model, then it works:

load(model, prefix='' if hasattr(model, 'bert') or pretrained_model_name not in PRETRAINED_MODEL_ARCHIVE_MAP else 'bert.')

Does this make sense? Let me know if I’m using BertModel.from_pretrained in the wrong way or if I should be using a different model for fine-tuning if I just care about the pooled_output representation.
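For reference, the workflow described in the question roughly corresponds to the sketch below; the model name, output directory, and file names here are illustrative assumptions, not taken from the original code.

import os
import torch
from pytorch_pretrained_bert import BertModel

# Initial load from a public checkpoint: all weights are found
model = BertModel.from_pretrained('bert-base-uncased')

# ... fine-tune on the pooled_output, then save as in run_classifier.py ...
os.makedirs('finetuned', exist_ok=True)
torch.save(model.state_dict(), 'finetuned/pytorch_model.bin')
with open('finetuned/bert_config.json', 'w') as f:
    f.write(model.config.to_json_string())

# Reloading from that directory is where the
# "Weights of BertModel not initialized from pretrained model" warning appears
model = BertModel.from_pretrained('finetuned/')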

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

2 reactions
thomwolf commented, Jan 24, 2019

Actually Sebastian, since the model you save and the model you load are instances of the same BertModel class, you can also simply use the standard PyTorch serialization practice (we only have a special from_pretrained loading function to be able to load various types of models using the same pre-trained model stored on AWS).

Just build a new BertModel using the configuration file you saved.

Here is a snippet:

# Imports assumed for this snippet (pytorch-pretrained-bert era API)
import torch
from pytorch_pretrained_bert import BertConfig, BertModel

# Saving (same as you did): unwrap DataParallel if needed,
# then dump the weights and the config
model_to_save = model_base.module if hasattr(model_base, 'module') else model_base
torch.save(model_to_save.state_dict(), save_file)
with open(config_file, 'w') as f:
    f.write(model_to_save.config.to_json_string())

# Loading (using standard PyTorch loading practice):
# rebuild the model from the saved config, then load the state dict
config = BertConfig(config_file)
model = BertModel(config)
model.load_state_dict(torch.load(save_file))
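
Once the model is rebuilt this way, it can be used directly to obtain the pooled_output the question is about. The sketch below assumes the pytorch-pretrained-bert API of the time, where BertModel returns (encoded_layers, pooled_output); the input ids are dummy values.

# Illustrative usage of the reloaded model (dummy input ids)
model.eval()
input_ids = torch.tensor([[101, 7592, 2088, 102]])
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    encoded_layers, pooled_output = model(input_ids, attention_mask=attention_mask)
print(pooled_output.shape)  # (1, hidden_size)
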
0 reactions
tsivaguru commented, Jun 18, 2020

Hi All,

I am facing the following issue while loading a pretrained BERT sequence model with my own data:

RuntimeError: Error(s) in loading state_dict for DataParallel: Missing key(s) in state_dict: “module.out.weight”, “module.out.bias”. Unexpected key(s) in state_dict: “bert.embeddings.word_embeddings.weight”, “bert.embeddings.position_embeddings.weight”, “bert.embeddings.token_type_embeddings.weight”, “bert.embeddings.LayerNorm.weight”, “bert.embeddings.LayerNorm.bias”, “bert.encoder.layer.0.attention.self.query.weight”, “bert.encoder.layer.0.attention.self.query.bias”, “bert.encoder.layer.0.attention.self.key.weight”, “bert.encoder.layer.0.attention.self.key.bias”, “bert.encoder.layer.0.attention.self.value.weight”, “bert.encoder.layer.0.attention.self.value.bias”, “bert.encoder.layer.0.attention.output.dense.weight”, “bert.encoder.layer.0.attention.output.dense.bias”, “bert.encoder.layer.0.attention.output.LayerNorm.weight”, “bert.encoder.layer.0.attention.output.LayerNorm.bias”, “bert.encoder.layer.0.intermediate.dense.weight”, “bert.encoder.layer.0.intermediate.dense.bias”, “bert.encoder.layer.0.output.dense.weight”, “bert.encoder.layer.0.output.dense.bias”, “bert.encoder.layer.0.output.LayerNorm.weight”, “bert.encoder.layer.0.output.LayerNorm.bias”, “bert.encoder.layer.1.attention.self.query.weight”, “bert.encoder.layer.1.attention.self.query.bias”, “bert.encoder.layer.1.attention.self.key.weight”, “bert.encoder.layer.1.attention.self.key.bias”, “bert.encoder.layer.1.attention.self.value.weight”, “bert.encoder.layer.1.attention.self.value.bias”, “bert.encoder.layer.1.attention.output.dense.weight”, “bert.encoder.layer.1.attention.output.dense.bias”, "bert.encoder.layer.1.attention.output.LayerNorm…

Any idea about this error?
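
The error above suggests a key mismatch: the checkpoint holds keys prefixed with bert., while the DataParallel-wrapped model expects keys prefixed with module. and also has a custom out layer that was never saved. One generic way to diagnose and work around this, sketched here as an assumption about the model structure rather than a fix confirmed in this thread, is to remap the checkpoint keys and load with strict=False:

import torch
import torch.nn as nn
from pytorch_pretrained_bert import BertModel

# Hypothetical stand-in for the custom classifier: a BertModel plus an
# output layer, wrapped in DataParallel (names and sizes are assumptions)
class BertClassifier(nn.Module):
    def __init__(self, bert):
        super().__init__()
        self.bert = bert
        self.out = nn.Linear(bert.config.hidden_size, 2)

model = nn.DataParallel(BertClassifier(BertModel.from_pretrained('bert-base-uncased')))

# The checkpoint keys look like 'bert.embeddings...', while the wrapped
# model expects 'module.bert.embeddings...', so remap before loading
state_dict = torch.load('pytorch_model.bin', map_location='cpu')
remapped = {'module.' + k: v for k, v in state_dict.items()}

# strict=False tolerates keys present on only one side, e.g. the custom
# 'module.out.*' head that is not in the checkpoint; inspect what remains
result = model.load_state_dict(remapped, strict=False)
print('still missing:', result.missing_keys)
print('still unexpected:', result.unexpected_keys)
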
