
Wrong argument passed during TFRobertaClassificationHead initialization

See original GitHub issue

🐛 Bug

Information

There is an issue preventing a RoBERTa classification model from being serialized. It stems from config being passed as the first positional argument to tf.keras.layers.Layer, whereas the expected first positional argument is trainable:

https://github.com/huggingface/transformers/blob/d6a677b14bcfd56b22fafeb212a27c6068886e07/src/transformers/modeling_tf_roberta.py#L327-L331

This is the root cause behind issue #3664 (about serialization). A related fix for GPT2: #2738.
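
To illustrate the pattern, here is a minimal, hypothetical sketch — the class name ClassificationHeadSketch and the dict used as a config are stand-ins, not the actual transformers source. It shows how forwarding config positionally into tf.keras.layers.Layer binds it to trainable, and what the corrected call (presumably what the fix does) looks like:

import tensorflow as tf

class ClassificationHeadSketch(tf.keras.layers.Layer):
    """Hypothetical stand-in for TFRobertaClassificationHead."""

    def __init__(self, config, **kwargs):
        # Buggy pattern from the issue:
        #     super().__init__(config, **kwargs)
        # Layer.__init__(trainable=True, name=None, ...) would bind `config`
        # to `trainable`, so layer.trainable becomes the config object and
        # Keras serialization of the layer breaks.
        #
        # Corrected pattern: keep config out of the arguments forwarded to
        # the base Layer, so trainable keeps its default of True.
        super().__init__(**kwargs)
        self.config = config

head = ClassificationHeadSketch(config={"hidden_size": 768})  # toy stand-in config
print(head.trainable)  # True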

Model I am using (Bert, XLNet …): RoBERTa

Language I am using the model on (English, Chinese …): English

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQuAD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Run the code below:
from transformers import TFRobertaForSequenceClassification

# Load the pretrained classification model and inspect the head's trainable flag.
base_model = TFRobertaForSequenceClassification.from_pretrained("roberta-base")
print(base_model.classifier.trainable)

Expected behavior

The output should be: True

Instead, the current output is the model's configuration, because config ends up bound to the classifier layer's trainable attribute:

RobertaConfig {
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 50265
}

Environment info

  • transformers version: 2.10.0
  • Platform: Colab
  • Python version: 3.6.9
  • PyTorch version (GPU?):
  • Tensorflow version (GPU?): 2.2.0
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
harkous commented, Jun 9, 2020

Thanks for making the checks. I submitted a PR: https://github.com/huggingface/transformers/pull/4884

1 reaction
jplu commented, Jun 9, 2020

Ok, thanks for the feedback. Indeed, the config parameter is important. I will take some time to review this. Sorry for the inconvenience.


Top Results From Across the Web

RoBERTa - Hugging Face
It is used to instantiate a RoBERTa model according to the specified arguments, defining the model architecture. Instantiating a configuration with the ...

Using Roberta classification head for fine-tuning a pre-trained ...
An example to show how we can use Huggingface Roberta Model for fine-tuning a classification task starting from a pre-trained model.

Finetuning Transformers with JAX + Haiku
Today we'll be walking through a port of the RoBERTa pre-trained model to JAX + Haiku, then finetuning the model to solve a...

Transfer learning and fine-tuning | TensorFlow Core
We add a Dropout layer before the classification layer, for regularization. We make sure to pass training=False when calling the base model, so ...

BERT Fine-Tuning Tutorial with PyTorch - Chris McCormick
In this tutorial I'll show you how to use BERT with the huggingface PyTorch library to quickly and efficiently fine-tune a model to...
