Wrong argument passed during TFRobertaClassificationHead initialization
🐛 Bug
Information
There is an issue preventing a RoBERTa classification model from being serialized. It is caused by `config` being passed as the first positional argument to `tf.keras.layers.Layer.__init__`, whose expected first positional argument is `trainable`. As a result, the layer's `trainable` attribute ends up holding the config object instead of a boolean.
This is the root cause behind issue #3664 (about serialization). A related fix for GPT2: #2738.
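For context, here is a minimal sketch of the offending pattern, simplified from `TFRobertaClassificationHead` in `modeling_tf_roberta.py` (v2.10.0); the dense/dropout sublayers are omitted here:

```python
import tensorflow as tf

class TFRobertaClassificationHead(tf.keras.layers.Layer):
    """Head for sentence-level classification tasks (simplified sketch)."""

    def __init__(self, config, **kwargs):
        # Bug: `config` is forwarded positionally to Layer.__init__, whose
        # signature starts (trainable=True, name=None, dtype=None, ...),
        # so the config object gets bound to `trainable`.
        super().__init__(config, **kwargs)
```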
Model I am using (Bert, XLNet …): RoBERTa
Language I am using the model on (English, Chinese …): English
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQuAD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior:
- Run the code below:
```python
from transformers import TFRobertaForSequenceClassification

base_model = TFRobertaForSequenceClassification.from_pretrained("roberta-base")
print(base_model.classifier.trainable)
```
Expected behavior
The expected output is:

```
True
```
The current output is:

```
RobertaConfig {
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 50265
}
```
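Until a fix lands, one possible workaround is to reset the flag by hand after loading. This is an untested sketch that assumes nothing else in the layer depends on the mis-assigned value; `Layer.trainable` is a settable Keras property, so reassigning it should restore the boolean:

```python
from transformers import TFRobertaForSequenceClassification

base_model = TFRobertaForSequenceClassification.from_pretrained("roberta-base")

# Workaround sketch: overwrite the config object that was mistakenly
# bound to `trainable` with the intended boolean default.
base_model.classifier.trainable = True
print(base_model.classifier.trainable)  # True
```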
Environment info
- `transformers` version: 2.10.0
- Platform: Colab
- Python version: 3.6.9
- PyTorch version (GPU?):
- Tensorflow version (GPU?): 2.2.0
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Thanks for making the checks. I submitted a PR: https://github.com/huggingface/transformers/pull/4884
Ok, thanks for the feedback. Indeed, the `config` parameter is important. I will take some time to review this. Sorry for the inconvenience.
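For reference, the fix in PR #4884 amounts to no longer forwarding `config` to the Keras base class. A simplified sketch of the corrected head (kernel-initializer arguments omitted):

```python
import tensorflow as tf

class TFRobertaClassificationHead(tf.keras.layers.Layer):
    """Head for sentence-level classification tasks (simplified sketch)."""

    def __init__(self, config, **kwargs):
        # Fixed: `config` is consumed here and not forwarded, so
        # Layer.__init__ keeps its default trainable=True.
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(
            config.hidden_size, activation="tanh", name="dense"
        )
        self.dropout = tf.keras.layers.Dropout(config.hidden_dropout_prob)
        self.out_proj = tf.keras.layers.Dense(config.num_labels, name="out_proj")
```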