Sharing Microsoft's DialogRPT (new dialog ranking model)
🌟 New model addition
Model description
Thanks for the awesome work!
DialogRPT (Dialog Ranking Pretrained Transformers) is a set of GPT-2-based dialogue ranking models recently released with an EMNLP paper by Microsoft Research. It's a follow-up to DialoGPT (thanks for hosting it!).
The architecture is pretty simple: a `GPT2Model` followed by a `torch.nn.Linear(n_embd, 1, bias=False)`, implemented based on a previous HuggingFace commit.
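For readers skimming this thread, here is a minimal sketch of that architecture, assuming a recent `transformers` version. The class name and the last-token pooling are illustrative, not the actual DialogRPT code:

```python
import torch
from transformers import GPT2Model


class DialogRPTSketch(torch.nn.Module):
    """Illustrative sketch: a GPT-2 backbone with a single bias-free scoring head."""

    def __init__(self, pretrained="gpt2-medium"):
        super().__init__()
        self.transformer = GPT2Model.from_pretrained(pretrained)
        # n_embd is 1024 for gpt2-medium; the head maps a hidden state to one scalar score
        self.score = torch.nn.Linear(self.transformer.config.n_embd, 1, bias=False)

    def forward(self, input_ids, attention_mask=None):
        hidden = self.transformer(
            input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # Pool the hidden state of the last token and project it to a scalar score
        return self.score(hidden[:, -1, :])
```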
At first I tried to create a model card for it, but then realized that no existing model architecture in HuggingFace seems compatible with DialogRPT: I noticed a lot of BERT-based sequence classification models, but ours is GPT-2 based.
If there's a simple fix (or I missed something), please let me know! If an implementation in `modeling_gpt2.py` is necessary, I'm also glad to help!
Open source status
- the model implementation is available: https://github.com/golsun/DialogRPT
- the model weights are available: https://github.com/golsun/DialogRPT
- who are the authors: @golsun @dreasysnail
Hi @golsun! `GPT2ForSequenceClassification` has been implemented in #7501, and I verified that I obtain the same results as you do on your README using your examples. You should only need to upload your models on the model hub now! Some helpers regarding the configuration (see the sketch after this list):
- Your models are based on the `gpt2-medium` configuration that you can find here.
- You should add a `num_labels=1` field to these configurations.
- In the `architectures` field, you should put `GPT2ForSequenceClassification`.
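A minimal sketch of those configuration tweaks in code (the output directory name is hypothetical; `GPT2Config` and `save_pretrained` are standard `transformers` APIs):

```python
from transformers import GPT2Config

# Start from the stock gpt2-medium configuration and add the suggested fields
config = GPT2Config.from_pretrained("gpt2-medium")
config.num_labels = 1
config.architectures = ["GPT2ForSequenceClassification"]
config.save_pretrained("./DialogRPT-updown")  # hypothetical output directory
```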
Thank you @LysandreJik! `AutoModelForSequenceClassification` works now. The inference webpage still gives the `Unrecognized configuration class` error, but I guess it will sync with the latest code soon. I'm going to add a model card to the original repo. Thanks again for the help!
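For anyone landing here later, a hedged usage sketch: the hub id `microsoft/DialogRPT-updown` and the `<|endoftext|>` context/response separator follow the DialogRPT README, but double-check both against the current model card:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed hub id; DialogRPT ships several ranking heads (updown, human-vs-rand, ...)
model_id = "microsoft/DialogRPT-updown"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Context and response joined by the <|endoftext|> token, per the DialogRPT README
inputs = tokenizer("I love NLP!<|endoftext|>Me too!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(torch.sigmoid(logits))  # probability-like ranking score
```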