resize_token_embeddings error for Transformer-XL
🐛 Bug
Information
Model I am using: Transformer-XL
Language I am using the model on: English
The problem arises when using:
- my own modified scripts: a fine-tuning script for TransfoXLLMHeadModel
To reproduce
The following code aims to add two new tokens to the vocabulary, "wug" and "wugs". After adding them to the tokenizer, we call `resize_token_embeddings` on the model so that its input embeddings have the correct dimension for the new tokens.
```python
import torch
from transformers import TransfoXLTokenizer, TransfoXLLMHeadModel

model = TransfoXLLMHeadModel.from_pretrained('transfo-xl-wt103')
tokenizer = TransfoXLTokenizer.from_pretrained('transfo-xl-wt103')

tokenizer.add_tokens(['wug', 'wugs'])
model.resize_token_embeddings(len(tokenizer))
```
Running the above gives the following error:
```
Traceback (most recent call last):
  File "bug.py", line 9, in <module>
    model.resize_token_embeddings(len(tokenizer))
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/transformers/modeling_utils.py", line 198, in resize_token_embeddings
    model_embeds = base_model._resize_token_embeddings(new_num_tokens)
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/transformers/modeling_utils.py", line 213, in _resize_token_embeddings
    new_embeddings = self._get_resized_embeddings(old_embeddings, new_num_tokens)
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/transformers/modeling_utils.py", line 234, in _get_resized_embeddings
    old_num_tokens, old_embedding_dim = old_embeddings.weight.size()
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'AdaptiveEmbedding' object has no attribute 'weight'
```
It seems that `resize_token_embeddings()` does not currently account for the particulars of the input embeddings used by `TransfoXLLMHeadModel`.
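For context on the error: the generic `_get_resized_embeddings` helper assumes the input embedding is a single `nn.Embedding` with a `.weight` matrix. Transformer-XL instead uses an `AdaptiveEmbedding` that splits the vocabulary into clusters, each with its own embedding layer and projection, so there is no top-level `.weight` to read. A simplified paraphrase of the structure (not the exact source of `modeling_transfo_xl.py`):

```python
import torch
import torch.nn as nn

class AdaptiveEmbedding(nn.Module):
    # Simplified paraphrase of transformers' modeling_transfo_xl.AdaptiveEmbedding.
    def __init__(self, n_token, d_embed, d_proj, cutoffs, div_val=1):
        super().__init__()
        self.n_token = n_token
        self.cutoffs = cutoffs + [n_token]
        # One embedding layer per vocabulary cluster -- note there is no
        # self.weight, which is exactly what _get_resized_embeddings accesses.
        self.emb_layers = nn.ModuleList()
        self.emb_projs = nn.ParameterList()
        for i in range(len(self.cutoffs)):
            l_idx = 0 if i == 0 else self.cutoffs[i - 1]
            r_idx = self.cutoffs[i]
            d_emb_i = d_embed // (div_val ** i)  # rarer clusters get smaller dims
            self.emb_layers.append(nn.Embedding(r_idx - l_idx, d_emb_i))
            self.emb_projs.append(nn.Parameter(torch.empty(d_proj, d_emb_i)))
```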
Expected behavior
We expect `resize_token_embeddings` to update the embedding layers appropriately for the new vocabulary size, so that the model can be used correctly with the new tokens.
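For comparison (not part of the original report), the same sequence of calls succeeds for a model whose input embedding is a single `nn.Embedding`, e.g. GPT-2:

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained('gpt2')
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

tokenizer.add_tokens(['wug', 'wugs'])
model.resize_token_embeddings(len(tokenizer))  # works: wte is a plain nn.Embedding
```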
Thank you in advance
Hi @vsieplus,

This is a known bug and sadly we don't have a solution for it right now. `TransfoXLLMHeadModel` uses adaptive embeddings, which makes this function not easy to implement. It should be implemented in the long run though - I will note it down. @thomwolf @LysandreJik
Thanks a lot @sgugger for answering here! As @sgugger mentioned, it'd be great if you could add a `_resize_token_embeddings()` function to `TransfoXLPreTrainedModel`.

The solution looks great to me @vsieplus!
You could make it a bit more compact, but that's a nitpick:
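The compact snippet that comment refers to is not reproduced above. Purely as an illustration of the approach under discussion, a `_resize_token_embeddings` on `TransfoXLPreTrainedModel` might grow only the last embedding cluster, since tokens added by the tokenizer receive the highest ids (hypothetical, untested sketch; attribute names like `n_token` and `cutoff_ends` follow the Transformer-XL code of that era):

```python
def _resize_token_embeddings(self, new_num_tokens):
    # Hypothetical sketch, not the code from this thread.
    embeddings = self.get_input_embeddings()   # the AdaptiveEmbedding module
    n_added = new_num_tokens - embeddings.n_token
    last = embeddings.emb_layers[-1]           # cluster holding the highest token ids
    # Reuse the generic helper on the plain nn.Embedding inside the cluster.
    embeddings.emb_layers[-1] = self._get_resized_embeddings(
        last, last.weight.size(0) + n_added
    )
    # Keep the bookkeeping consistent with the new vocabulary size.
    embeddings.n_token = new_num_tokens
    embeddings.cutoffs[-1] = new_num_tokens
    embeddings.cutoff_ends[-1] = new_num_tokens
    self.config.n_token = new_num_tokens
    return embeddings
```

Note this only covers the input side; the adaptive softmax in the LM head (`crit` in `TransfoXLLMHeadModel`) would need a matching resize before the model can actually score the new tokens.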