How to modify the model config?
Well, I am trying to generate embeddings for a long sentence, and I get this error:
Traceback (most recent call last):
    all_encoder_layers, _ = model(input_ids, token_type_ids=None, attention_mask=input_mask)
  File "/Users/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/venv/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 611, in forward
    embedding_output = self.embeddings(input_ids, token_type_ids)
  File "/Users/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/venv/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 196, in forward
    position_embeddings = self.position_embeddings(position_ids)
  File "/Users/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/venv/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 110, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/Users/venv/lib/python3.6/site-packages/torch/nn/functional.py", line 1110, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: index out of range at /Users/soumith/code/builder/wheel/pytorch-src/aten/src/TH/generic/THTensorMath.cpp:352
I found that max_position_embeddings (default 512) is being exceeded. This value comes from the config that is downloaded during the initial setup. The download originally went to the default location PYTORCH_PRETRAINED_BERT_CACHE, where I could only find the model file and vocab.txt (named with random characters) but no config.json. I then downloaded to a specific local directory using the cache_dir parameter, and I had the same problem finding bert_config.json there.
I also found a JSON file, named with seemingly random characters, in both the default cache and the local cache. When I opened it, all I saw was this:
{"url": "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz", "etag": "\"61343686707ed78320e9e7f406946db2-49\""}
Any help with modifying the config.json would be appreciated.
Or, if this is being caused by something else entirely, please let me know.
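For reference, a minimal sketch of the setup described above, assuming the old pytorch_pretrained_bert API (BertTokenizer / BertModel.from_pretrained with a cache_dir argument); it prints the loaded config so the max_position_embeddings limit can be checked before running the model:

```python
import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel

# Download/load into an explicit cache directory instead of
# the default PYTORCH_PRETRAINED_BERT_CACHE location.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased", cache_dir="./bert_cache")
model = BertModel.from_pretrained("bert-base-uncased", cache_dir="./bert_cache")
model.eval()

# The config the model was loaded with is attached to the model,
# so there is no need to locate bert_config.json inside the cached archive.
print(model.config)                          # full BertConfig
print(model.config.max_position_embeddings)  # 512 for bert-base-uncased

text = "some very long sentence ..."
tokens = ["[CLS]"] + tokenizer.tokenize(text) + ["[SEP]"]
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

# Inputs longer than max_position_embeddings trigger the index error above.
if input_ids.size(1) > model.config.max_position_embeddings:
    print("input too long:", input_ids.size(1))
else:
    with torch.no_grad():
        all_encoder_layers, _ = model(input_ids, token_type_ids=None,
                                      attention_mask=torch.ones_like(input_ids))
```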

Indeed, it doesn’t make sense to go over 512 tokens for a pre-trained model.
If you have longer text, you should try the sliding-window approach described in the original BERT repo: https://github.com/google-research/bert/issues/66
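As an illustration only (not the exact recipe from that issue), here is a hedged sketch of one way to run overlapping 512-token windows over a long token sequence and pool the per-window outputs; the stride, the mean pooling, and the omission of per-window [CLS]/[SEP] tokens are arbitrary simplifications:

```python
import torch

def sliding_windows(token_ids, max_len=512, stride=256):
    """Yield overlapping windows of at most max_len token ids."""
    # Illustrative helper; max_len must not exceed the model's
    # max_position_embeddings (512 for bert-base-uncased).
    start = 0
    while True:
        yield token_ids[start:start + max_len]
        if start + max_len >= len(token_ids):
            break
        start += stride

def embed_long_sequence(model, token_ids):
    """Run each window through BERT and average the last-layer outputs."""
    outputs = []
    for window in sliding_windows(token_ids):
        input_ids = torch.tensor([window])
        with torch.no_grad():
            encoder_layers, _ = model(input_ids,
                                      token_type_ids=None,
                                      attention_mask=torch.ones_like(input_ids))
        # encoder_layers[-1] has shape (1, window_len, hidden_size)
        outputs.append(encoder_layers[-1].mean(dim=1))
    return torch.cat(outputs).mean(dim=0)  # (hidden_size,)
```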
Customizing these options does not make sense when using pretrained models; it only makes sense when training your own model from scratch.
You cannot use the pretrained models with a max_position_embeddings other than 512, because the pretrained models only contain pretrained embeddings for 512 positions. The original Transformer paper introduced a positional encoding that allows extrapolation to arbitrary input lengths, but this was not used in BERT.
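One way to see this concretely (a small sketch, assuming the pytorch_pretrained_bert BertModel layout, where the positional embeddings live at model.embeddings.position_embeddings):

```python
from pytorch_pretrained_bert import BertModel

model = BertModel.from_pretrained("bert-base-uncased")

# The pretrained position-embedding table has exactly 512 rows,
# one learned vector per position, so there are no weights for indices >= 512.
print(model.embeddings.position_embeddings.weight.shape)  # torch.Size([512, 768])
```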
You can override max_position_embeddings, but it won’t have any effect. The model will probably still run fine for shorter inputs, but for an input longer than 512 word pieces you will get a RuntimeError: cuda runtime error (59), because the position-embedding lookup will attempt to use an index larger than the embedding table.
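If the extra context is not needed, the simplest workaround (a sketch of plain truncation, as opposed to the sliding-window approach above) is to cut the word pieces down to the 512-position limit before building input_ids:

```python
import torch
from pytorch_pretrained_bert import BertTokenizer

MAX_LEN = 512  # max_position_embeddings of the pretrained model
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

long_text = "..."  # your long input string
tokens = tokenizer.tokenize(long_text)
# Reserve two positions for the [CLS] and [SEP] special tokens.
tokens = ["[CLS]"] + tokens[:MAX_LEN - 2] + ["[SEP]"]
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
```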