Loading a TF pretrained model into BertForSequenceClassification module
Hi, there might be something I am doing wrong, but I cannot figure out what that is, so any help would be welcome.
After downloading a TF checkpoint (containing model.index, model.data, model.meta, config.json and vocab.txt files), I used it to run further pretraining on additional text that is more relevant to my downstream task. The pretraining was performed with the API from BERT's official GitHub repository, and it produced new model.index, model.data and model.meta files. I am now trying to load them into the BertForSequenceClassification module with the from_pretrained method. I figured I should pass a config instance as well, so I used the config.json file that came with the original TF checkpoint. But when I pass the index file to from_pretrained, I get the error:
AttributeError: 'BertForSequenceClassification' object has no attribute 'bias'
Any help would be much appreciated.
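For reference, the failing call looks roughly like this (paths are placeholders, and the exact arguments may differ slightly from what I ran):

```python
from transformers import BertConfig, BertForSequenceClassification

# config.json that came with the original downloaded checkpoint
config = BertConfig.from_json_file("original_checkpoint/config.json")

# pointing from_pretrained at the .index file of the further-pretrained checkpoint
model = BertForSequenceClassification.from_pretrained(
    "pretraining_output/model.ckpt.index",
    from_tf=True,
    config=config,
)
# raises: AttributeError: 'BertForSequenceClassification' object has no attribute 'bias'
```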
Top GitHub Comments
Hi! You can't directly load an official TensorFlow checkpoint into the PyTorch model; you first need to convert it. You can use this script to convert it to a PyTorch checkpoint.
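Roughly, the conversion amounts to something like the following (paths are placeholders, and the top-level import of load_tf_weights_in_bert may differ between versions):

```python
import os

from transformers import BertConfig, BertForPreTraining, load_tf_weights_in_bert

# config.json that shipped with the original checkpoint
config = BertConfig.from_json_file("original_checkpoint/config.json")

# The official pretraining checkpoint also contains the MLM/NSP heads,
# so the weights are loaded into BertForPreTraining rather than the
# classification model.
model = BertForPreTraining(config)

# Pass the checkpoint prefix (model.ckpt), not the .index file itself
load_tf_weights_in_bert(model, config, "pretraining_output/model.ckpt")

# Write pytorch_model.bin and config.json to a local directory
os.makedirs("converted", exist_ok=True)
model.save_pretrained("converted")
```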
If you want to use our TensorFlow interface in order to do this, you would still need to use this script to convert the checkpoint to our format, and then use TFBertForSequenceClassification while specifying the from_pt option (as the result of the conversion is a PyTorch checkpoint).

You could use the script you mention to convert the model to PyTorch; this PyTorch checkpoint can then be seamlessly loaded in a TensorFlow implementation, see the comment above: https://github.com/huggingface/transformers/issues/3931#issuecomment-618629373
After that you can just do the following, and you should have a tf_model.h5 in the save directory.
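A minimal sketch of that step, assuming the converted PyTorch checkpoint from above lives in a local directory:

```python
from transformers import TFBertForSequenceClassification

# Load the converted PyTorch checkpoint into the TensorFlow implementation
model = TFBertForSequenceClassification.from_pretrained("converted", from_pt=True)

# Saving the TF model writes tf_model.h5 alongside config.json
model.save_pretrained("converted")
```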