Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might look while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Use Electra with `from_pretrained` in the transformers library

See original GitHub issue

Is your feature request related to a problem? Please describe.
We trained an ElectraForSequenceClassification model, but when we tried to load this pretrained model with the transformers library's ElectraForSequenceClassification using the .from_pretrained method, we got the following warnings:

Some weights of the model checkpoint at models/electra-base-generator-final were not used when initializing ElectraForSequenceClassification: ['pooler.dense.weight', 'pooler.dense.bias', 'classifier.weight', 'classifier.bias']
- This IS expected if you are initializing ElectraForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing ElectraForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of ElectraForSequenceClassification were not initialized from the model checkpoint at models/electra-base-generator-final and are newly initialized: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
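
For reference, a minimal sketch of the loading call that produces warnings like the ones above (the checkpoint path is the one from the log; the directory is assumed to contain the usual config.json and pytorch_model.bin):

```python
from transformers import ElectraForSequenceClassification

# Loading a checkpoint whose classification-head layout differs from what
# ElectraForSequenceClassification expects triggers the
# "Some weights ... were not used / are newly initialized" warnings.
model = ElectraForSequenceClassification.from_pretrained(
    "models/electra-base-generator-final"
)
```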

Describe the solution you'd like
Is there any way to convert this model without running training again?

Describe alternatives you've considered
Can you provide a script, or some hints on how such a conversion could be implemented?
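
Purely as an illustration of what such a conversion script might look like (the paths, num_labels, and the decision to copy the old linear classifier into out_proj are assumptions; whether that copy is meaningful depends on how the original head was trained), something along these lines is possible:

```python
import torch
from transformers import ElectraConfig, ElectraForSequenceClassification

src = "models/electra-base-generator-final"   # checkpoint path from the warning above
dst = "models/electra-base-converted"         # hypothetical output path

# Raw tensors saved with the original model; depending on how the checkpoint
# was written, backbone keys may or may not carry an "electra." prefix.
state = torch.load(f"{src}/pytorch_model.bin", map_location="cpu")

config = ElectraConfig.from_pretrained(src, num_labels=2)   # num_labels is an assumption
model = ElectraForSequenceClassification(config)

# The old head is a plain linear layer ('classifier.weight' / 'classifier.bias');
# ElectraForSequenceClassification expects 'classifier.dense.*' and
# 'classifier.out_proj.*'. Copy the linear layer into out_proj when shapes match.
if ("classifier.weight" in state and "classifier.bias" in state
        and state["classifier.weight"].shape == model.classifier.out_proj.weight.shape):
    state["classifier.out_proj.weight"] = state.pop("classifier.weight")
    state["classifier.out_proj.bias"] = state.pop("classifier.bias")

# strict=False tolerates whatever still does not line up (pooler.*, classifier.dense.*).
missing, unexpected = model.load_state_dict(state, strict=False)
print("still missing:", missing)
print("still unexpected:", unexpected)

model.save_pretrained(dst)
```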

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

1 reaction
WeberJulian commented, Oct 12, 2020

Hi, is there any news on this topic? I trained a model with simpletransformers, but my inference code uses the transformers library.

1 reaction
ThilinaRajapakse commented, Jul 14, 2020

You should be able to use it without retraining the model.

The warning is issued because the model weights are initialized directly through PyTorch instead of through the from_pretrained() method.
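
The warning is informational in that sense: from_pretrained() still loads every tensor it can match and only re-initializes the rest. Assuming the classification head was actually trained or has been remapped as sketched earlier (and that tokenizer files were saved alongside the weights; both are assumptions here), inference with plain transformers would look roughly like this:

```python
import torch
from transformers import ElectraTokenizerFast, ElectraForSequenceClassification

ckpt_dir = "models/electra-base-generator-final"   # path from the warning above

# Assumes tokenizer files were saved alongside the model weights.
tokenizer = ElectraTokenizerFast.from_pretrained(ckpt_dir)
model = ElectraForSequenceClassification.from_pretrained(ckpt_dir)
model.eval()

inputs = tokenizer("An example sentence to classify.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs)[0]   # first output is the classification logits
print(logits.softmax(dim=-1))
```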

Read more comments on GitHub >

Top Results From Across the Web

  • EleutherAI/gpt-j-6B - Hugging Face
    GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" represents ...
  • Deploy GPT-J 6B for inference using Hugging Face ...
    Learn how to deploy EleutherAI's GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker.
  • Load a pre-trained model from disk with Huggingface ...
    Assuming your pre-trained (PyTorch-based) transformer model is in a 'model' folder in your current working directory, the following code can load your ...
  • A Deep Dive Into Transformers Library - Analytics Vidhya
    Here, we will deep dive into the Transformers library and explore how to use available pre-trained models and tokenizers from ModelHub.
  • Pretrain Transformers Models in PyTorch Using Hugging Face ...
    Use an already pretrained transformers model and fine-tune (continue training) it on your custom dataset. The transformers library needs to be ...
