Loading a TF pretrained model into BertForSequenceClassification module
Hi, there might be something I am doing wrong, but I cannot figure out what that is, so any help would be welcome.
After downloading a TF checkpoint (containing model.index, model.data, model.meta, config.json and vocab.txt files), I used it to run further pretraining on additional text that is more relevant to my downstream task. The pretraining was performed with the API from BERT's official GitHub repository, and it produced new model.index, model.data and model.meta files. I am now trying to load them into the BertForSequenceClassification module with the from_pretrained method. I figured I should pass a config instance as well, so I used the config.json file that came with the original TF checkpoint. But when I pass the index file to from_pretrained, I get the error:
AttributeError: 'BertForSequenceClassification' object has no attribute 'bias'
Any help would be much appreciated.
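For reference, the failing call looks roughly like this (paths are placeholders, and the exact arguments may differ slightly from what I ran):

```python
from transformers import BertConfig, BertForSequenceClassification

# config.json that came with the original downloaded checkpoint
config = BertConfig.from_json_file("original_checkpoint/config.json")

# pointing from_pretrained at the .index file of the further-pretrained checkpoint
model = BertForSequenceClassification.from_pretrained(
    "pretraining_output/model.ckpt.index",
    from_tf=True,
    config=config,
)
# raises: AttributeError: 'BertForSequenceClassification' object has no attribute 'bias'
```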
Top GitHub Comments
Hi! You can't directly load an official TensorFlow checkpoint into the PyTorch model; you first need to convert it. You can use this script to convert it to a PyTorch checkpoint.
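Roughly, the conversion amounts to something like the following (paths are placeholders, and the top-level import of load_tf_weights_in_bert may differ between versions):

```python
import os

from transformers import BertConfig, BertForPreTraining, load_tf_weights_in_bert

# config.json that shipped with the original checkpoint
config = BertConfig.from_json_file("original_checkpoint/config.json")

# The official pretraining checkpoint also contains the MLM/NSP heads,
# so the weights are loaded into BertForPreTraining rather than the
# classification model.
model = BertForPreTraining(config)

# Pass the checkpoint prefix (model.ckpt), not the .index file itself
load_tf_weights_in_bert(model, config, "pretraining_output/model.ckpt")

# Write pytorch_model.bin and config.json to a local directory
os.makedirs("converted", exist_ok=True)
model.save_pretrained("converted")
```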
If you want to use our TensorFlow interface in order to do this, you would still need to use this script to convert the checkpoint to our format, and then use TFBertForSequenceClassification while specifying the from_pt option (as the result of the conversion is a PyTorch checkpoint).

You could use the script you mention to convert the model to PyTorch; this PyTorch checkpoint can then be seamlessly loaded in a TensorFlow implementation, see the comment above: https://github.com/huggingface/transformers/issues/3931#issuecomment-618629373
After that you can just do the following, and you should have a tf_model.h5 in the save directory.
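A minimal sketch of that step, assuming the converted PyTorch checkpoint from above lives in a local directory:

```python
from transformers import TFBertForSequenceClassification

# Load the converted PyTorch checkpoint into the TensorFlow implementation
model = TFBertForSequenceClassification.from_pretrained("converted", from_pt=True)

# Saving the TF model writes tf_model.h5 alongside config.json
model.save_pretrained("converted")
```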