Questions about model naming conventions and about sentiment classification with T5 and ELECTRA.
I have a few questions about the model naming, plus a short question about T5 and ELECTRA. I would have opened separate issues, but none of these points is very complex. I mainly work on CV, so sorry if I'm asking something silly.
1 Cased or Uncased
What is meant by cased and uncased? For example:
bert-base-uncased
bert-base-cased
2 Suffix
I was trying to run the XLM model, but among the pre-trained weights I found the following. I understand the XLM-MLM part, but I couldn't work out the rest of the name, e.g. enfr-1024, enro-1024, etc.
xlm-mlm-enfr-1024
xlm-mlm-enro-1024
xlm-mlm-tlm-xnli15-1024
3 Sentiment Analysis using T5 and ELECTRA
Is it possible to use these two models for sentiment classification, i.e. simple binary classification? How would we implement them? I have a high-level understanding of T5: it casts both the input and the target as text. I find that appealing, but I'm having some trouble implementing it. Is there a convenient way to do this with transformers?
Top GitHub Comments
Hi, it is easy to use the pre-trained T5 models for sentiment ID. You could do something like:
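(The snippet from the original comment did not survive the copy; the following is a minimal sketch of the idea, assuming the `t5-base` checkpoint and the `T5Tokenizer` / `T5ForConditionalGeneration` classes from transformers. The example sentence is made up.)

```python
# Minimal sketch: run a pre-trained T5 checkpoint with the SST-2 task
# prefix and read the generated text as the sentiment label.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

sentence = "This movie was absolutely wonderful."  # example input (assumed)
inputs = tokenizer("sst2 sentence: " + sentence, return_tensors="pt")

output_ids = model.generate(**inputs)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))  # e.g. "positive"
```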
The "sst2 sentence:" prefix is what we used for the SST-2 task, which is a sentiment ID task. The model needs to see this prefix to know which task you want it to undertake.

No, the target should always be text for T5. You should map your 0/1 labels to the words "negative" and "positive", fine-tune T5 to predict those words, and then map them back to 0/1 after the model outputs the text if needed. This is the point of the text-to-text framework: all tasks take text as input and produce text as output. So, for example, your "build model" code should not include a dense layer with a sigmoid output, etc. There is no modification to the model structure necessary whatsoever.
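(A hypothetical sketch of that label-to-word mapping for fine-tuning; the example sentences and the `label_words` dict are invented for illustration, not taken from the original comment.)

```python
# Hypothetical sketch: cast a binary-labelled dataset into text-to-text pairs
# that T5 can be fine-tuned on, with no change to the model architecture.
label_words = {0: "negative", 1: "positive"}

raw_examples = [("I loved every minute of it", 1),
                ("The plot made no sense", 0)]

inputs  = ["sst2 sentence: " + text for text, _ in raw_examples]
targets = [label_words[label] for _, label in raw_examples]

# After generation, map the predicted word back to 0/1 if you need integers.
word_to_label = {word: label for label, word in label_words.items()}
```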
Yes, that is the intention.