convert_tf_checkpoint_to_pytorch 'BertPreTrainingHeads' object has no attribute 'squad'
See original GitHub issue

Trying to convert BERT checkpoints to PyTorch checkpoints. It worked for the default uncased bert_model.ckpt. However, after we did a custom training run of the TensorFlow version and then tried to convert the TF checkpoint to PyTorch, it fails with: 'BertPreTrainingHeads' object has no attribute 'squad'
Adding debug prints to the attribute-resolution loop in the conversion code shows the following:

    elif l[0] == 'output_bias' or l[0] == 'beta':
        pointer = getattr(pointer, 'bias')
    elif l[0] == 'output_weights':
        pointer = getattr(pointer, 'weight')
    else:
        print("--> ", str(l))        # printed this
        print("==> ", str(pointer))  # printed this
        pointer = getattr(pointer, l[0])
Output:

    --> ['squad']
    ==> BertPreTrainingHeads(
      (predictions): BertLMPredictionHead(
        (transform): BertPredictionHeadTransform(
          (dense): Linear(in_features=768, out_features=768, bias=True)
          (LayerNorm): BertLayerNorm()
        )
        (decoder): Linear(in_features=768, out_features=30522, bias=False)
      )
      (seq_relationship): Linear(in_features=768, out_features=2, bias=True)
    )
- Can you please tell us what is happening? Does TensorFlow add something during fine-tuning? It is not clear where the 'squad' name in the TensorFlow ckpt file comes from.
- What needs to be done to fix this?
- Are you planning to fix this and release updated code?
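For context on the error: the conversion code walks each '/'-separated TensorFlow variable name and resolves it attribute by attribute on the PyTorch model with getattr, so any scope name the model does not define (such as 'squad', which a SQuAD fine-tuning run adds to the checkpoint) falls through to the final else branch and raises AttributeError. A minimal sketch of that traversal, using simplified stand-in classes rather than the actual transformers code:

```python
class BertPreTrainingHeads:
    """Simplified stand-in: defines only the attributes of the default pre-training head."""
    def __init__(self):
        self.predictions = "LM head"
        self.seq_relationship = "NSP head"

def resolve(pointer, scope_names):
    # Mirrors the getattr walk in the conversion code (sketch):
    # known TF names are remapped, everything else is looked up verbatim.
    for name in scope_names:
        if name in ("output_bias", "beta"):
            pointer = getattr(pointer, "bias")
        elif name == "output_weights":
            pointer = getattr(pointer, "weight")
        else:
            pointer = getattr(pointer, name)  # 'squad' fails here
    return pointer

heads = BertPreTrainingHeads()
print(resolve(heads, ["seq_relationship"]))  # resolves fine
try:
    resolve(heads, ["squad"])  # scope added by SQuAD fine-tuning
except AttributeError as err:
    print(err)  # 'BertPreTrainingHeads' object has no attribute 'squad'
```

The stand-in class and the resolve helper are illustrative names, not part of the transformers library.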
Issue Analytics
- Created 4 years ago
- Comments:14 (4 by maintainers)

A possible solution if you're copying a SQuAD-fine-tuned BERT from TF to PT

Issue:
AttributeError: 'BertPreTrainingHeads' object has no attribute 'classifier'

It works for me with the following steps:

Step 1. In the script convert_tf_checkpoint_to_pytorch.py (or convert_bert_original_tf_checkpoint_to_pytorch.py), replace BertForPreTraining with BertForQuestionAnswering.

Step 2. Open the source file modeling_bert.py in your package under site-packages\transformers. In load_tf_weights_in_bert, replace

    elif l[0] == 'squad':
        pointer = getattr(pointer, 'classifier')

with

    elif l[0] == 'squad':
        pointer = getattr(pointer, 'qa_outputs')

This should work, since qa_outputs, not classifier, is the attribute name of the output layer of BertForQuestionAnswering.

Step 3. After copying, check your PyTorch model by evaluating dev-v2.0.json with a command like:

    python run_squad.py --model_type bert --model_name_or_path MODEL_PATH --do_eval --train_file None --predict_file dev-v2.0.json --max_seq_length 384 --doc_stride 128 --output_dir ./output/ --version_2_with_negative

where output_dir should contain a copy of the PyTorch model. For a BERT-Base model this results in an evaluation like:

    {
      "exact": 72.99755748336563,
      "f1": 76.24686988414918,
      "total": 11873,
      "HasAns_exact": 72.82388663967612,
      "HasAns_f1": 79.33182964482165,
      "HasAns_total": 5928,
      "NoAns_exact": 73.17073170731707,
      "NoAns_f1": 73.17073170731707,
      "NoAns_total": 5945,
      "best_exact": 74.3619978101575,
      "best_exact_thresh": -3.6369030475616455,
      "best_f1": 77.12234803941384,
      "best_f1_thresh": -3.6369030475616455
    }

However, if you use BertForTokenClassification instead, the model will not be copied correctly, since the classification layers are structured differently; I tried this and got a model with an F1 score of 10%.

Hi @SandeepBhutani, I pushed a commit to master which should help you do this kind of thing.
First, switch to master by cloning the repo, and then follow these instructions:

The convert_tf_checkpoint_to_pytorch conversion script is made to create a BertForPreTraining model, which is not your use case, but you can load another type of model by reproducing the behavior of this script as follows:
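A minimal sketch of that approach, assuming a version of transformers where load_tf_weights_in_bert is importable from the top-level package and takes (model, config, tf_checkpoint_path) — check the signature in your installed version. The function name convert_tf_qa_checkpoint and all file paths are illustrative placeholders:

```python
import torch
from transformers import BertConfig, BertForQuestionAnswering, load_tf_weights_in_bert

def convert_tf_qa_checkpoint(config_path, tf_ckpt_path, out_path):
    """Illustrative helper: convert a SQuAD-fine-tuned TF BERT checkpoint to PyTorch."""
    # Build the PyTorch model that matches the fine-tuned TF head
    # (BertForQuestionAnswering instead of the default BertForPreTraining).
    config = BertConfig.from_json_file(config_path)
    model = BertForQuestionAnswering(config)
    # Populate it from the TF checkpoint, then save in PyTorch format.
    load_tf_weights_in_bert(model, config, tf_ckpt_path)
    torch.save(model.state_dict(), out_path)

if __name__ == "__main__":
    # Placeholder paths; point these at your own checkpoint files.
    convert_tf_qa_checkpoint("bert_config.json", "model.ckpt", "pytorch_model.bin")
```

The resulting pytorch_model.bin can then be placed in the output_dir used by run_squad.py for evaluation, as described in the steps above.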