Implement PyTorch and/or TensorFlow sequence classification architectures for causal language models
🚀 Feature request
The architecture `GPT2ForSequenceClassification` was added in #7501 in PyTorch. It would be great to have it in TensorFlow (cf. issue #7622), but it would also be great to have it for other causal models: ~OpenAI GPT~, ~CTRL~ (PR opened by @elk-cloner), ~TransfoXL~ (PR opened by @spatil6).
Below is a list of items to follow to make sure the integration of such an architecture is complete:
- Implement `XXXForSequenceClassification` in `modeling_xxx.py` or `TFXXXForSequenceClassification` in `modeling_tf_xxx.py`
- Test that architecture in `tests/test_modeling_xxx.py` or `tests/test_modeling_tf_xxx.py`
- Add that architecture to `__init__.py` and `docs/source/model_doc/xxx.rst`.
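For anyone picking this up, here is a rough PyTorch sketch of what a sequence classification head on top of a causal model could look like. It is not the code from #7501. `ToyCausalBackbone` and `ToyForSequenceClassification` are hypothetical names made up for illustration, and the pooling strategy (classifying from the last non-padding token, assuming right padding and a pad id that never occurs inside a sequence) is an assumption about the general approach, since a causal model only sees the whole input at its final position.

```python
import torch
from torch import nn


class ToyCausalBackbone(nn.Module):
    """Hypothetical stand-in for a causal LM body (not a transformers class)."""

    def __init__(self, vocab_size=100, hidden_size=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids):
        # (batch, seq_len) -> (batch, seq_len, hidden_size)
        return self.embed(input_ids)


class ToyForSequenceClassification(nn.Module):
    """Hypothetical classification head over the backbone's hidden states."""

    def __init__(self, backbone, hidden_size=16, num_labels=2, pad_token_id=0):
        super().__init__()
        self.backbone = backbone
        self.score = nn.Linear(hidden_size, num_labels, bias=False)
        self.pad_token_id = pad_token_id

    def forward(self, input_ids, labels=None):
        hidden_states = self.backbone(input_ids)
        logits = self.score(hidden_states)  # (batch, seq_len, num_labels)

        # Index of the last non-padding token in each row.
        # Assumes right padding and that pad_token_id is not a real token.
        last_idx = (input_ids != self.pad_token_id).sum(dim=1) - 1
        last_idx = last_idx.clamp(min=0)
        batch_idx = torch.arange(input_ids.size(0), device=input_ids.device)
        pooled_logits = logits[batch_idx, last_idx]  # (batch, num_labels)

        loss = None
        if labels is not None:
            loss = nn.functional.cross_entropy(pooled_logits, labels)
        return loss, pooled_logits
```

A real implementation would wrap the library's existing base model instead of the toy backbone and follow the library's output conventions. The sketch only shows the pooling and loss logic that the tests in `tests/test_modeling_xxx.py` would need to exercise.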
Taking a look at the code changes in #7501 would be a good start.
A very good first issue to get acquainted with the library and its architectures!
Issue Analytics
- State:
- Created 3 years ago
- Comments:18 (10 by maintainers)
Ok thanks @LysandreJik.
I’m waiting for PR #8714 to get merged. Once that is done, I’ll raise PRs for these models as well.
I believe CTRL and TransfoXL are still available. Feel free to open a PR!