convert_tf_checkpoint_to_pytorch 'BertPreTrainingHeads' object has no attribute 'squad'
See original GitHub issue

Trying to convert BERT checkpoints to PyTorch checkpoints. It worked for the default uncased bert_model.ckpt. However, after we did a custom training run of the TensorFlow version and then tried to convert the TF checkpoint to PyTorch, it fails with: 'BertPreTrainingHeads' object has no attribute 'squad'
Adding debug prints to the attribute-resolution loop in the conversion code shows the following:

    elif l[0] == 'output_bias' or l[0] == 'beta':
        pointer = getattr(pointer, 'bias')
    elif l[0] == 'output_weights':
        pointer = getattr(pointer, 'weight')
    else:
        print("--> ", str(l))        # printed this
        print("==> ", str(pointer))  # printed this
        pointer = getattr(pointer, l[0])
Output:

    --> ['squad']
    ==> BertPreTrainingHeads(
      (predictions): BertLMPredictionHead(
        (transform): BertPredictionHeadTransform(
          (dense): Linear(in_features=768, out_features=768, bias=True)
          (LayerNorm): BertLayerNorm()
        )
        (decoder): Linear(in_features=768, out_features=30522, bias=False)
      )
      (seq_relationship): Linear(in_features=768, out_features=2, bias=True)
    )
- Can you please tell us what is happening? Does TensorFlow add something during fine-tuning? It is not clear where the 'squad' name in the TensorFlow ckpt file comes from.
- What needs to be done to fix this?
- Are you planning to fix this and release updated code?
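For context on the error: the conversion code walks each '/'-separated TensorFlow variable name and resolves it attribute by attribute on the PyTorch model with getattr, so any scope name the model does not define (such as 'squad', which a SQuAD fine-tuning run adds to the checkpoint) falls through to the final else branch and raises AttributeError. A minimal sketch of that traversal, using simplified stand-in classes rather than the actual transformers code:

```python
class BertPreTrainingHeads:
    """Simplified stand-in: defines only the attributes of the default pre-training head."""
    def __init__(self):
        self.predictions = "LM head"
        self.seq_relationship = "NSP head"

def resolve(pointer, scope_names):
    # Mirrors the getattr walk in the conversion code (sketch):
    # known TF names are remapped, everything else is looked up verbatim.
    for name in scope_names:
        if name in ("output_bias", "beta"):
            pointer = getattr(pointer, "bias")
        elif name == "output_weights":
            pointer = getattr(pointer, "weight")
        else:
            pointer = getattr(pointer, name)  # 'squad' fails here
    return pointer

heads = BertPreTrainingHeads()
print(resolve(heads, ["seq_relationship"]))  # resolves fine
try:
    resolve(heads, ["squad"])  # scope added by SQuAD fine-tuning
except AttributeError as err:
    print(err)  # 'BertPreTrainingHeads' object has no attribute 'squad'
```

The stand-in class and the resolve helper are illustrative names, not part of the transformers library.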
Issue Analytics
- Created 4 years ago
- Comments:14 (4 by maintainers)

A possible solution if you're copying a SQuAD-fine-tuned BERT from TF to PT

Issue:
AttributeError: 'BertPreTrainingHeads' object has no attribute 'classifier'

It works for me with the following steps:

Step 1. In the script convert_tf_checkpoint_to_pytorch.py (or convert_bert_original_tf_checkpoint_to_pytorch.py), replace BertForPreTraining with BertForQuestionAnswering.

Step 2. Open the source file modeling_bert.py in your package under site-packages\transformers. In load_tf_weights_in_bert, replace

    elif l[0] == 'squad':
        pointer = getattr(pointer, 'classifier')

with

    elif l[0] == 'squad':
        pointer = getattr(pointer, 'qa_outputs')

This should work, since qa_outputs, not classifier, is the attribute name of the output layer of BertForQuestionAnswering.

Step 3. After copying, check your PyTorch model by evaluating dev-v2.0.json with a command like:

    python run_squad.py --model_type bert --model_name_or_path MODEL_PATH --do_eval --train_file None --predict_file dev-v2.0.json --max_seq_length 384 --doc_stride 128 --output_dir ./output/ --version_2_with_negative

where output_dir should contain a copy of the PyTorch model. For a BERT-Base model this results in an evaluation like:

    {
      "exact": 72.99755748336563,
      "f1": 76.24686988414918,
      "total": 11873,
      "HasAns_exact": 72.82388663967612,
      "HasAns_f1": 79.33182964482165,
      "HasAns_total": 5928,
      "NoAns_exact": 73.17073170731707,
      "NoAns_f1": 73.17073170731707,
      "NoAns_total": 5945,
      "best_exact": 74.3619978101575,
      "best_exact_thresh": -3.6369030475616455,
      "best_f1": 77.12234803941384,
      "best_f1_thresh": -3.6369030475616455
    }

However, if you use BertForTokenClassification instead, the model will not be copied correctly, since the classification layers are structured differently; I tried this and got a model with an F1 score of 10%.

Hi @SandeepBhutani, I pushed a commit to master which should help you do this kind of thing.
First, switch to master by cloning the repo, and then follow these instructions:

The convert_tf_checkpoint_to_pytorch conversion script is made to create a BertForPreTraining model, which is not your use case, but you can load another type of model by reproducing the behavior of this script as follows:
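A minimal sketch of that approach, assuming a version of transformers where load_tf_weights_in_bert is importable from the top-level package and takes (model, config, tf_checkpoint_path) — check the signature in your installed version. The function name convert_tf_qa_checkpoint and all file paths are illustrative placeholders:

```python
import torch
from transformers import BertConfig, BertForQuestionAnswering, load_tf_weights_in_bert

def convert_tf_qa_checkpoint(config_path, tf_ckpt_path, out_path):
    """Illustrative helper: convert a SQuAD-fine-tuned TF BERT checkpoint to PyTorch."""
    # Build the PyTorch model that matches the fine-tuned TF head
    # (BertForQuestionAnswering instead of the default BertForPreTraining).
    config = BertConfig.from_json_file(config_path)
    model = BertForQuestionAnswering(config)
    # Populate it from the TF checkpoint, then save in PyTorch format.
    load_tf_weights_in_bert(model, config, tf_ckpt_path)
    torch.save(model.state_dict(), out_path)

if __name__ == "__main__":
    # Placeholder paths; point these at your own checkpoint files.
    convert_tf_qa_checkpoint("bert_config.json", "model.ckpt", "pytorch_model.bin")
```

The resulting pytorch_model.bin can then be placed in the output_dir used by run_squad.py for evaluation, as described in the steps above.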