GPT as a Language Model
I am interested in using GPT as a language model to assign a language modeling score (perplexity) to a sentence. Here is what I am using:
import math
import torch
from pytorch_pretrained_bert import OpenAIGPTTokenizer, OpenAIGPTLMHeadModel

# Load pre-trained model (weights)
model = OpenAIGPTLMHeadModel.from_pretrained('openai-gpt')
model.eval()

# Load pre-trained model tokenizer (vocabulary)
tokenizer = OpenAIGPTTokenizer.from_pretrained('openai-gpt')

def score(sentence):
    tokenize_input = tokenizer.tokenize(sentence)
    tensor_input = torch.tensor([tokenizer.convert_tokens_to_ids(tokenize_input)])
    with torch.no_grad():
        # With lm_labels set, the LM head returns the cross-entropy loss over the tokens
        loss = model(tensor_input, lm_labels=tensor_input)
    return math.exp(loss.item())

a = ['there is a book on the desk',
     'there is a plane on the desk',
     'there is a book in the desk']
print([score(i) for i in a])
[21.31652459381952, 61.45907380241148, 26.24923942649312]
Is this the right way to score a sentence?
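One way to sanity-check what that loss measures is to recompute the per-token log-probabilities from the raw logits. Below is an untested sketch that assumes the model and tokenizer defined above; it averages the negative log-likelihood over the predicted tokens, so its exp() is a per-token perplexity:

import torch
import torch.nn.functional as F

def score_manual(sentence):
    tokens = tokenizer.tokenize(sentence)
    ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
    with torch.no_grad():
        # Without lm_labels the LM head returns logits of shape (1, seq_len, vocab)
        logits = model(ids)
    # Token i+1 is predicted from positions <= i, so shift logits and labels by one
    log_probs = F.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    token_log_probs = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    mean_nll = -token_log_probs.mean()   # average negative log-likelihood per predicted token
    return torch.exp(mean_nll).item()    # per-token perplexity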
I’m confused about whether the right way to calculate perplexity for GPT-2 is what the OP has done, or what the documentation at https://huggingface.co/transformers/perplexity.html describes. Or are the two equivalent for some value of the stride?
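For reference, the sliding-window computation described on that page looks roughly like the following (a condensed, untested sketch using the transformers GPT-2 API; the text, max_length, and stride values are placeholders). For a short sentence that fits in one window it reduces to a single forward pass, so the stride only matters for texts longer than the model's context size:

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.eval()

encodings = tokenizer("a long evaluation text goes here ...", return_tensors='pt')
max_length = model.config.n_positions   # 1024 for GPT-2
stride = 512

nlls = []
for i in range(0, encodings.input_ids.size(1), stride):
    begin_loc = max(i + stride - max_length, 0)
    end_loc = min(i + stride, encodings.input_ids.size(1))
    trg_len = end_loc - i   # may be shorter than stride on the last window
    input_ids = encodings.input_ids[:, begin_loc:end_loc]
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100   # mask the overlapping context so only the new tokens are scored
    with torch.no_grad():
        loss = model(input_ids, labels=target_ids)[0]   # mean NLL over the scored tokens
    nlls.append(loss * trg_len)

ppl = torch.exp(torch.stack(nlls).sum() / end_loc)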
Is this score normalized by sentence length? If not, what do I need to change to normalize it?
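For what it's worth, the loss returned by the LM head in the original snippet is the mean cross-entropy (negative log-likelihood) per predicted token, so exp(loss) is a per-token perplexity and is length-normalized in that sense. To get the total, unnormalized sentence log-probability instead, multiply back by the number of predicted tokens, roughly as in this sketch (assuming the model and tokenizer from the original post, and that the head scores len(tokens) - 1 predictions after its internal label shift):

import math
import torch

def sentence_scores(sentence):
    tokens = tokenizer.tokenize(sentence)
    ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
    with torch.no_grad():
        mean_nll = model(ids, lm_labels=ids).item()   # mean NLL per predicted token
    n_predicted = ids.size(1) - 1                     # the first token has no left context
    per_token_ppl = math.exp(mean_nll)                # length-normalized score
    total_log_prob = -mean_nll * n_predicted          # total sentence log-probability, not normalized
    return per_token_ppl, total_log_prob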