Adding special tokens to the model
Hello,
I am trying to use model.tokenizer.add_special_tokens(special_tokens_dict) to add some special tokens to the model, but after doing that I get an indexing error (IndexError: index out of range in self) when I try to encode a sentence. How can I learn the vector representations of the new tokens? Is it something like model.resize_token_embeddings(len(t))?
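For context, a minimal sketch that reproduces the error with a plain Hugging Face transformers model; the model name and the [XXX] token are illustrative, and model.tokenizer is assumed to wrap a standard transformers tokenizer:

```python
# Minimal reproduction (names are illustrative): adding special tokens
# grows the tokenizer's vocabulary, but the model's embedding table
# keeps its old size, so the new token ids fall out of range.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

special_tokens_dict = {"additional_special_tokens": ["[XXX]"]}
tokenizer.add_special_tokens(special_tokens_dict)

inputs = tokenizer("hello [XXX] world", return_tensors="pt")
model(**inputs)  # IndexError: index out of range in self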
You can use this code:
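A sketch in that spirit, assuming a plain Hugging Face transformers model and tokenizer rather than a wrapper; the exact snippet in the original comment may have differed:

```python
# Sketch of the fix (assumed setup: a Hugging Face transformers model):
# after adding tokens, resize the embedding matrix so every new
# token id maps to a row in the embedding table.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

special_tokens_dict = {"additional_special_tokens": ["[XXX]"]}
tokenizer.add_special_tokens(special_tokens_dict)

# Grow the input embeddings to len(tokenizer); the new rows are
# randomly initialized and trainable.
model.resize_token_embeddings(len(tokenizer))

inputs = tokenizer("hello [XXX] world", return_tensors="pt")
outputs = model(**inputs)  # no IndexError now
```

Note that resize_token_embeddings only allocates randomly initialized rows for the new tokens; their vector representations are learned once the model is fine-tuned on text that contains them.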
Yes, it is correct.