
Queries about the notation and model training of T5 and ELECTRA for sentiment classification.


I have a few questions about the model notation, plus short questions about T5 and ELECTRA. I would have opened separate issues, but none of these is too complex. I mainly work on CV, so sorry if these are silly questions.

1 Cased or Uncased

What is meant by cased and uncased?

bert-base-uncased
bert-base-cased

2 Suffix

I was trying to run the XLM model, but among the pre-trained weights I found the following. I understand the xlm-mlm part, but I couldn’t work out the rest of the name, e.g. enfr-1024, enro-1024, etc.

xlm-mlm-enfr-1024
xlm-mlm-enro-1024
xlm-mlm-tlm-xnli15-1024

3 Sentiment Analysis using T5 and ELECTRA

Is it possible to use these two models for sentiment classification, i.e. simple binary classification? How would we implement that with these two transformers? I have a high-level understanding of T5: it casts both the input and the target as text. I find that idea useful, but I’m having some trouble implementing it. Is there a convenient way to do this with transformers?

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 13 (4 by maintainers)

Top GitHub Comments

6 reactions
craffel commented, Apr 8, 2020

Hi, it is easy to use the pre-trained T5 models for sentiment ID. You could do something like

MODEL_NAME = "t5-base"
model = transformers.T5ForConditionalGeneration.from_pretrained(MODEL_NAME)
tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_NAME)
input_text = "sst2 sentence: This movie was great! I loved the acting."
inputs = tokenizer.encode_plus(input_text, return_token_type_ids=False, return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs)[0]))
input_text = "sst2 sentence: The acting was so bad in this movie I left immediately."
inputs = tokenizer.encode_plus(input_text, return_token_type_ids=False, return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs)[0]))

The "sst2 sentence:" prefix is what we used for the SST-2 task. It is a sentiment ID task. The model needs to see this prefix to know what task you want it to undertake.

4 reactions
craffel commented, Apr 9, 2020

I’m sure I’m missing some crucial part of the text-to-text approach here. If I convert the 1 and 0 labels to Positive and Negative…I mean, shouldn’t the target be numeric?

No, the target should always be text for T5. You should map your 0/1 labels to the words “negative” and “positive” and fine-tune T5 to predict those words, and then map them back to 0/1 after the model outputs the text if needed. This is the point of the text-to-text framework - all tasks take text as input and produce text as output. So, for example, your “build model” code should not include a dense layer with a sigmoid output, etc. There is no modification to the model structure necessary whatsoever.
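
To make that concrete, here is a minimal fine-tuning sketch along those lines (not from the thread; the label words, checkpoint, and training-loop details are assumptions, and it assumes a recent transformers version): map 0/1 to text targets, train T5 to generate those words, and map the generated word back to 0/1 at inference time.

import transformers

MODEL_NAME = "t5-base"
tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_NAME)
model = transformers.T5ForConditionalGeneration.from_pretrained(MODEL_NAME)
model.train()

label_map = {0: "negative", 1: "positive"}          # numeric label -> text target (assumed wording)
inverse_map = {v: k for k, v in label_map.items()}  # generated text -> numeric label

# One (sentence, label) training pair; the task prefix goes on every input sample.
sentence, label = "The acting was wonderful.", 1
inputs = tokenizer("sst2 sentence: " + sentence, return_tensors="pt")
targets = tokenizer(label_map[label], return_tensors="pt")

# Text-to-text fine-tuning step: the decoder learns to emit the label word itself,
# so no classification head or sigmoid output is added to the model.
loss = model(input_ids=inputs.input_ids,
             attention_mask=inputs.attention_mask,
             labels=targets.input_ids).loss
loss.backward()  # plug into your optimizer / training loop

# At inference time, generate text and map it back to 0/1 if needed.
model.eval()
pred_text = tokenizer.decode(model.generate(**inputs)[0], skip_special_tokens=True)
pred_label = inverse_map.get(pred_text.strip())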

And about the prefix sst2 sentence: so this is, in other words, a string indicator that informs the model about the goal or task. So, do I have to add this string at the beginning of every sentence (sample)?

Yes, that is the intention.
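
The answers above cover the T5 half of question 3. The ELECTRA half isn’t addressed in this thread, but the usual route in recent transformers versions is an ordinary sequence-classification head fine-tuned on your labeled data. A rough sketch (the checkpoint name and two-label setup are assumptions, not something confirmed here):

import torch
import transformers

# ELECTRA discriminator checkpoint with a freshly initialized 2-way classification head.
MODEL_NAME = "google/electra-small-discriminator"
tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_NAME)
model = transformers.ElectraForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Unlike T5, the labels stay numeric (0/1) and the head outputs logits over the two classes.
inputs = tokenizer("This movie was great! I loved the acting.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits        # shape (1, 2); only meaningful after fine-tuning
pred = logits.argmax(dim=-1).item()        # 0 or 1

Fine-tuning then follows the standard sequence-classification recipe; nothing text-to-text is involved for ELECTRA.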
