Wrong argument passed during TFRobertaClassificationHead initialization
🐛 Bug
Information
There is an issue preventing a RoBERTa classification model from being serialized. It is caused by `config` being passed as the first positional argument to `tf.keras.layers.Layer.__init__`, whose expected first positional argument is `trainable`. As a result, the layer's `trainable` attribute ends up holding the config object instead of a boolean.
This is the root cause behind issue #3664 (about serialization). A related fix for GPT2: #2738.
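For context, here is a minimal sketch of the offending pattern, simplified from `TFRobertaClassificationHead` in `modeling_tf_roberta.py` (v2.10.0); the dense/dropout sublayers are omitted here:

```python
import tensorflow as tf

class TFRobertaClassificationHead(tf.keras.layers.Layer):
    """Head for sentence-level classification tasks (simplified sketch)."""

    def __init__(self, config, **kwargs):
        # Bug: `config` is forwarded positionally to Layer.__init__, whose
        # signature starts (trainable=True, name=None, dtype=None, ...),
        # so the config object gets bound to `trainable`.
        super().__init__(config, **kwargs)
```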
Model I am using (Bert, XLNet …): RoBERTa
Language I am using the model on (English, Chinese …): English
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQuAD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior:
- Run the code below:
```python
from transformers import TFRobertaForSequenceClassification

base_model = TFRobertaForSequenceClassification.from_pretrained("roberta-base")
print(base_model.classifier.trainable)
```
Expected behavior
The expected output is:

```
True
```
The current output is:

```
RobertaConfig {
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 50265
}
```
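Until a fix lands, one possible workaround is to reset the flag by hand after loading. This is an untested sketch that assumes nothing else in the layer depends on the mis-assigned value; `Layer.trainable` is a settable Keras property, so reassigning it should restore the boolean:

```python
from transformers import TFRobertaForSequenceClassification

base_model = TFRobertaForSequenceClassification.from_pretrained("roberta-base")

# Workaround sketch: overwrite the config object that was mistakenly
# bound to `trainable` with the intended boolean default.
base_model.classifier.trainable = True
print(base_model.classifier.trainable)  # True
```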
Environment info
- `transformers` version: 2.10.0
- Platform: Colab
- Python version: 3.6.9
- PyTorch version (GPU?):
- Tensorflow version (GPU?): 2.2.0
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Thanks for making the checks. I submitted a PR: https://github.com/huggingface/transformers/pull/4884
Ok, thanks for the feedback. Indeed, the `config` parameter is important. I will take some time to review this. Sorry for the inconvenience.
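For reference, the fix in PR #4884 amounts to no longer forwarding `config` to the Keras base class. A simplified sketch of the corrected head (kernel-initializer arguments omitted):

```python
import tensorflow as tf

class TFRobertaClassificationHead(tf.keras.layers.Layer):
    """Head for sentence-level classification tasks (simplified sketch)."""

    def __init__(self, config, **kwargs):
        # Fixed: `config` is consumed here and not forwarded, so
        # Layer.__init__ keeps its default trainable=True.
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(
            config.hidden_size, activation="tanh", name="dense"
        )
        self.dropout = tf.keras.layers.Dropout(config.hidden_dropout_prob)
        self.out_proj = tf.keras.layers.Dense(config.num_labels, name="out_proj")
```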