
convert_tf_checkpoint_to_pytorch 'BertPreTrainingHeads' object has no attribute 'squad'

See original GitHub issue

Trying to convert BERT checkpoints to PyTorch. It worked for the default uncased bert_model.ckpt. However, after a custom training run of the TensorFlow version, converting the TF checkpoint to PyTorch fails with: 'BertPreTrainingHeads' object has no attribute 'squad'
Adding debug prints to the name-mapping loop shows:

elif l[0] == 'output_bias' or l[0] == 'beta':
    pointer = getattr(pointer, 'bias')
elif l[0] == 'output_weights':
    pointer = getattr(pointer, 'weight')
else:
    print("--> ", str(l))        # debug print (output below)
    print("==> ", str(pointer))  # debug print (output below)
    pointer = getattr(pointer, l[0])

output:

--> ['squad']
==> BertPreTrainingHeads(
  (predictions): BertLMPredictionHead(
    (transform): BertPredictionHeadTransform(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (LayerNorm): BertLayerNorm()
    )
    (decoder): Linear(in_features=768, out_features=30522, bias=False)
  )
  (seq_relationship): Linear(in_features=768, out_features=2, bias=True)
)
  • Can you tell us what is happening? Does TensorFlow add something during fine-tuning? We are not sure how the 'squad' scope got into the TensorFlow ckpt file.
  • What needs to be done to fix this?
  • Are you planning to fix this and release updated code?
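For context, the failure can be reproduced in miniature: load_tf_weights_in_bert walks each '/'-separated segment of a TF variable name and calls getattr on the PyTorch module tree, so an unexpected top-level scope such as 'squad' (added by SQuAD fine-tuning) hits a module that has no attribute of that name. A minimal sketch with stand-in names, not the real transformers classes:

```python
class BertPreTrainingHeadsStub:
    """Stand-in for BertPreTrainingHeads; like the real module,
    it has no attribute named 'squad'."""
    def __init__(self):
        self.predictions = object()       # stand-in for the LM head
        self.seq_relationship = object()  # stand-in for the NSP head

pointer = BertPreTrainingHeadsStub()
# A SQuAD-fine-tuned checkpoint contains variables under a 'squad' scope
# (e.g. 'squad/output_weights'); the mapping loop attempts this lookup:
try:
    pointer = getattr(pointer, "squad")
except AttributeError as err:
    print(err)  # ... object has no attribute 'squad'
```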

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments:14 (4 by maintainers)

Top GitHub Comments

9 reactions
Hya-cinthus commented, Nov 20, 2019

A possible solution if you're copying a SQuAD-fine-tuned BERT from TF to PyTorch:

Issue: AttributeError: 'BertPreTrainingHeads' object has no attribute 'classifier'

It works for me by doing the following steps:

Step 1. In the script convert_tf_checkpoint_to_pytorch.py (or convert_bert_original_tf_checkpoint_to_pytorch.py):

  • Replace all BertForPreTraining with BertForQuestionAnswering.

Step 2. Open the source code file modeling_bert.py in your package site-packages\transformers:

  • In the function load_tf_weights_in_bert, replace elif l[0] == 'squad': pointer = getattr(pointer, 'classifier') with elif l[0] == 'squad': pointer = getattr(pointer, 'qa_outputs')

It should work since qa_outputs is the attribute name for the output layer of BertForQuestionAnswering instead of classifier.
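In miniature, the remapping in Step 2 does the following — a sketch with a hypothetical stand-in class, not the real load_tf_weights_in_bert:

```python
class QAModelStub:
    """Stand-in for BertForQuestionAnswering: its span-prediction
    head lives under the attribute name 'qa_outputs'."""
    def __init__(self):
        self.qa_outputs = "span-prediction layer"

def resolve(pointer, segment):
    # The one-line fix: the TF checkpoint's 'squad' scope has no direct
    # counterpart in the module tree, so route it to 'qa_outputs'.
    if segment == "squad":
        return getattr(pointer, "qa_outputs")
    return getattr(pointer, segment)

print(resolve(QAModelStub(), "squad"))  # span-prediction layer
```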

Step 3. After converting, check your PyTorch model by evaluating dev-v2.0.json with a script like this:

python run_squad.py --model_type bert --model_name_or_path MODEL_PATH --do_eval --train_file None --predict_file dev-v2.0.json --max_seq_length 384 --doc_stride 128 --output_dir ./output/ --version_2_with_negative

where output_dir should contain a copy of the PyTorch model.

This produced the following evaluation for a BERT-Base model:

{
  "exact": 72.99755748336563,
  "f1": 76.24686988414918,
  "total": 11873,
  "HasAns_exact": 72.82388663967612,
  "HasAns_f1": 79.33182964482165,
  "HasAns_total": 5928,
  "NoAns_exact": 73.17073170731707,
  "NoAns_f1": 73.17073170731707,
  "NoAns_total": 5945,
  "best_exact": 74.3619978101575,
  "best_exact_thresh": -3.6369030475616455,
  "best_f1": 77.12234803941384,
  "best_f1_thresh": -3.6369030475616455
}

However, if you use BertForTokenClassification instead, the model will not be copied correctly, since the structure of the classification layer is different. I tried this and got a model with an F1 score of 10%.

3 reactions
thomwolf commented, Apr 3, 2019

Hi @SandeepBhutani, I pushed a commit to master which should help you do this kind of thing.

First, switch to master by cloning the repo, then follow these instructions:

The convert_tf_checkpoint_to_pytorch conversion script creates a BertForPreTraining model, which is not your use case, but you can load another model type by reproducing the script's behavior as follows:

import torch
from pytorch_pretrained_bert import BertConfig, BertForTokenClassification, load_tf_weights_in_bert

# Initialise a configuration according to your model
config = BertConfig.from_pretrained('bert-XXX-XXX')

# You will need to load a BertForTokenClassification model
model = BertForTokenClassification(config)

# Load weights from the TF checkpoint (tf_checkpoint_path is your checkpoint file)
load_tf_weights_in_bert(model, tf_checkpoint_path)

# Save the PyTorch model
print("Save PyTorch model to {}".format(pytorch_dump_path))
torch.save(model.state_dict(), pytorch_dump_path)
