Training Classification Model + tokenizer from Scratch
See original GitHub issue

Describe the bug
I’ve been looking through the repository and noticed that ClassificationModel requires a pre-trained model as part of the args. Is there any chance we can take a model from scratch, train a tokenizer for our dataset, and train the model on classification? Any help is really appreciated; I’ve really enjoyed using the Simple Transformers library for my research!
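For the tokenizer half of the question, here is a minimal sketch using the Hugging Face `tokenizers` library, which Simple Transformers builds on. The corpus path, output directory, and the 52,000 vocabulary size are illustrative assumptions, not values from this issue:

```python
import os

from tokenizers import ByteLevelBPETokenizer

# Train a byte-level BPE tokenizer from scratch on raw text.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["data/corpus.txt"],  # placeholder path to your raw training text
    vocab_size=52_000,
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)

# Writes vocab.json and merges.txt, the files a RoBERTa-style tokenizer needs.
os.makedirs("custom-tokenizer", exist_ok=True)
tokenizer.save_model("custom-tokenizer")
```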
Issue Analytics
- Created: 3 years ago
- Comments: 10 (3 by maintainers)
Top Results From Across the Web

- How to train a new language model from scratch using ...
  "Train a tokenizer. Let's arbitrarily pick its size to be 52,000. We recommend training a byte-level BPE (rather than, let's say, a WordPiece..." (a sketch based on this approach follows this list)
- Transformers From Scratch: Training a Tokenizer
  "How to train a transformer model from scratch. We learn where to get ... All you need to create a custom tokenizer using..."
- Build a RoBERTa Model from Scratch, by Yulia Nudelman
  "In this article, we will build a pre-trained transformer model FashionBERT using the Hugging Face models. Goal. The goal is to train a..."
- Training a token classification model with fast.ai (YouTube)
  "Last week we covered how to use the tokenizer library to get our data to a state where we can train token classification..."
- Step 3: Prepare Your Data | Machine Learning
  "In the subsequent paragraphs, we will see how to do tokenization and vectorization for sequence models. We will also cover how we can..."
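Putting those pieces together, one possible route (a sketch only; whether a given Simple Transformers version accepts it is not confirmed by this issue) is to build a randomly initialised model around the new tokenizer with plain `transformers`, save both to one directory, and point ClassificationModel at that directory instead of a checkpoint name. All directory names and config values below are illustrative:

```python
from transformers import (
    RobertaConfig,
    RobertaForSequenceClassification,
    RobertaTokenizerFast,
)

# Randomly initialised weights: nothing pre-trained is downloaded here.
config = RobertaConfig(
    vocab_size=52_000,  # must match the tokenizer trained above
    max_position_embeddings=514,
    num_hidden_layers=6,
    num_attention_heads=12,
    type_vocab_size=1,
    num_labels=2,
)
model = RobertaForSequenceClassification(config)

# Load the byte-level BPE files saved earlier and bundle the tokenizer and
# the untrained model into one directory.
tokenizer = RobertaTokenizerFast.from_pretrained(
    "custom-tokenizer", model_max_length=512
)
tokenizer.save_pretrained("scratch-roberta")
model.save_pretrained("scratch-roberta")
```

`ClassificationModel("roberta", "scratch-roberta", num_labels=2)` should then start from the random initialisation on disk rather than downloading a checkpoint, though, as the maintainer's comment below suggests, this is likely to trail a pre-trained starting point.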
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I can answer this empirically: option 2 is superior. That is why transfer learning is so powerful.

The philosophical question is whether it’s worth the effort to fine-tune a pre-trained language model on the domain-specific text (i.e., fine-tune the language model itself) before training it on the classification task. In this case, I would suggest skipping the language-model fine-tuning at first and coming back to it if the final results are not satisfactory.
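As a rough sketch of that suggestion (not the maintainer's own code): Simple Transformers also ships a LanguageModelingModel that can fine-tune a checkpoint on domain text before the classification stage. The file names and args below are placeholders, the exact options may vary across versions, and it is assumed here that the fine-tuned weights land in `output_dir`:

```python
from simpletransformers.language_modeling import LanguageModelingModel
from simpletransformers.classification import ClassificationModel

# Step 1 (optional, per the comment above): fine-tune the language model
# itself on domain-specific raw text, one sample per line.
lm = LanguageModelingModel(
    "roberta",
    "roberta-base",
    args={"output_dir": "domain-lm", "overwrite_output_dir": True},
    use_cuda=False,  # set True if a GPU is available
)
lm.train_model("data/domain_corpus.txt")  # placeholder corpus file

# Step 2: train the classifier starting from the fine-tuned weights.
clf = ClassificationModel("roberta", "domain-lm", num_labels=2, use_cuda=False)
```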
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.