RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
Hi, I was using run_classification.py with the LayoutLM base model on my own document set. However, I came across this error:
04/25/2020 09:26:55 - WARNING - __main__ - Process rank: -1, device: cuda:0, n_gpu: 1, distributed training: False, 16-bits training: False
04/25/2020 09:26:55 - INFO - transformers.configuration_utils - loading configuration file /home/ubuntu/rwik_xx_document_classification/unilm/layoutlm-base-uncased/config.json
04/25/2020 09:26:55 - INFO - transformers.configuration_utils - Model config {
"attention_probs_dropout_prob": 0.1,
"finetuning_task": "cdip",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"is_decoder": false,
"layer_norm_eps": 1e-12,
"max_2d_position_embeddings": 1024,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"num_labels": 2,
"output_attentions": false,
"output_hidden_states": false,
"output_past": true,
"pruned_heads": {},
"torchscript": false,
"type_vocab_size": 2,
"use_bfloat16": false,
"vocab_size": 30522
}
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - Model name '/home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased). Assuming '/home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/' is a path or url to a directory containing tokenizer files.
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - Didn't find file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/added_tokens.json. We won't load it.
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - loading file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/vocab.txt
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - loading file None
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - loading file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/special_tokens_map.json
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - loading file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/tokenizer_config.json
04/25/2020 09:26:55 - INFO - transformers.modeling_utils - loading weights file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/pytorch_model.bin
04/25/2020 09:27:16 - INFO - transformers.modeling_utils - Weights of LayoutLMForSequenceClassification not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
04/25/2020 09:27:16 - INFO - transformers.modeling_utils - Weights from pretrained model not used in LayoutLMForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight']
04/25/2020 09:27:19 - INFO - __main__ - Training/evaluation parameters Namespace(adam_epsilon=1e-08, cache_dir='', config_name='', data_dir='/home/ubuntu/rwik-xx/xx_pdftotext_html_output/', dev_folder='trial_24_04_test', device=device(type='cuda', index=0), do_eval=True, do_lower_case=True, do_train=True, eval_all_checkpoints=True, evaluate_during_training=True, fp16=False, fp16_opt_level='O1', gradient_accumulation_steps=2, hierarchical_tokens=False, learning_rate=5e-05, local_rank=-1, logging_steps=100, max_grad_norm=1.0, max_seq_length=448, max_steps=10000, model_name_or_path='/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm-base-uncased/', model_type='layoutlm', n_gpu=1, no_cuda=False, nuance_mode=True, num_train_epochs=4.0, output_dir='/home/ubuntu/rwik_xx_document_classification/weight_files/1_layout_lm_base_24_04/', output_mode='classification', overwrite_cache=False, overwrite_output_dir=False, per_gpu_eval_batch_size=16, per_gpu_train_batch_size=8, save_steps=100, seed=566, server_ip='', server_port='', stride_len=112, task_name='cdip', tokenizer_name='', tpu=False, tpu_ip_address='', tpu_name='', tqdm_notebook_mode=False, train_folder='trial_24_04_train', warmup_steps=500, weight_decay=0.0, xrt_tpu_config='')
04/25/2020 09:27:19 - INFO - __main__ - Creating features from dataset file at /home/ubuntu/rwik-xx/xx_pdftotext_html_output/
Gettting train examples: 100%|██████████| 789/789 [00:26<00:00, 29.34it/s]
0%| | 0/789 [00:00<?, ?it/s]04/25/2020 09:27:46 - INFO - utils_classification - *** Example ***
04/25/2020 09:27:46 - INFO - utils_classification - guid: train-1
04/25/2020 09:27:46 - INFO - utils_classification - input_ids: 101 9986 2271 23773 11255 8909 1024 5709... ( truncated )
04/25/2020 09:27:46 - INFO - utils_classification - bboxes: [0, 0, 0, 0] [0, 0, 0, 0] [55, 44, 109, 53] [55, 44, 109, 53] [55, 44, 109, 53] [112, 44, 164, 53] ......[1000, 1000, 1000, 1000] ( truncated )
04/25/2020 09:27:46 - INFO - utils_classification - attention_mask: 1 1 1 1 1 1 1 1 1 1 1 ... ( truncated )
04/25/2020 09:27:46 - INFO - utils_classification - token_type_ids: 0 0 0 0 0 0 0 0 0 0 0 ... ( truncated )
04/25/2020 09:27:46 - INFO - utils_classification - label: xxx (id = 1)
04/25/2020 09:27:46 - INFO - utils_classification - *** Example ***
.
.
.
(truncated)
100%|██████████| 789/789 [00:30<00:00, 26.17it/s]
04/25/2020 09:28:16 - INFO - __main__ - Saving features into cached file /home/ubuntu/rwik_xx/xx_pdftotext_html_output/cached_train_layoutlm-base-uncased_448_cdip
04/25/2020 09:28:19 - INFO - __main__ - ***** Running training *****
04/25/2020 09:28:19 - INFO - __main__ - Num examples = 1929
04/25/2020 09:28:19 - INFO - __main__ - Num Epochs = 83
04/25/2020 09:28:19 - INFO - __main__ - Instantaneous batch size per GPU = 8
04/25/2020 09:28:19 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 16
04/25/2020 09:28:19 - INFO - __main__ - Gradient Accumulation steps = 2
04/25/2020 09:28:19 - INFO - __main__ - Total optimization steps = 10000
Epoch: 0%| | 0/83 [00:00<?, ?it/s]
Iteration: 0%| | 0/242 [00:00<?, ?it/s]
Epoch: 0%| | 0/83 [00:00<?, ?it/s]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/run_classification.py in <module>()
925
926 if __name__ == "__main__":
--> 927 main()
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/run_classification.py in main()
859 args, args.task_name, tokenizer, evaluate=False
860 )
--> 861 global_step, tr_loss = train(args, train_dataset, model, tokenizer)
862 logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)
863
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/run_classification.py in train(args, train_dataset, model, tokenizer)
217 ) # XLM, DistilBERT and RoBERTa don't use segment_ids
218 #pdb.set_trace()
--> 219 outputs = model(**inputs)
220 loss = outputs[
221 0
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/modeling_layoutlm.py in forward(self, input_ids, bbox, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels)
328 token_type_ids=token_type_ids,
329 position_ids=position_ids,
--> 330 head_mask=head_mask,
331 )
332
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/modeling_layoutlm.py in forward(self, input_ids, bbox, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask)
191 )
192 encoder_outputs = self.encoder(
--> 193 embedding_output, extended_attention_mask, head_mask=head_mask
194 )
195 sequence_output = encoder_outputs[0]
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
378 all_hidden_states = all_hidden_states + (hidden_states,)
379
--> 380 layer_outputs = layer_module(hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask)
381 hidden_states = layer_outputs[0]
382
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
349
350 def forward(self, hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None):
--> 351 self_attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
352 attention_output = self_attention_outputs[0]
353 outputs = self_attention_outputs[1:] # add self attentions if we output attention weights
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
303
304 def forward(self, hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None):
--> 305 self_outputs = self.self(hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
306 attention_output = self.output(self_outputs[0], hidden_states)
307 outputs = (attention_output,) + self_outputs[1:] # add attentions if we output them
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
213
214 def forward(self, hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None):
--> 215 mixed_query_layer = self.query(hidden_states)
216
217 # If this is instantiated as a cross-attention module, the keys
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/linear.py in forward(self, input)
85
86 def forward(self, input):
---> 87 return F.linear(input, self.weight, self.bias)
88
89 def extra_repr(self):
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/functional.py in linear(input, weight, bias)
1610 ret = torch.addmm(bias, input, weight.t())
1611 else:
-> 1612 output = input.matmul(weight.t())
1613 if bias is not None:
1614 output += bias
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
I am using torch 1.3.0, transformers 2.2.1, and Python 3.
Note that I have modified the loading process to suit the format my dataset is in, and I have verified that it is being loaded correctly. In addition, I tried running roberta-base on the same dataset and it worked.
When I ran the Python debugger and checked the value of the input variable, I got this error:
> /home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/functional.py(1612)linear()
1610 ret = torch.addmm(bias, input, weight.t())
1611 else:
-> 1612 output = input.matmul(weight.t())
1613 if bias is not None:
1614 output += bias
ipdb> input
*** RuntimeError: cuda runtime error (700) : an illegal memory access was encountered at /pytorch/aten/src/THC/THCCachingHostAllocator.cpp:278
ipdb> weight
*** RuntimeError: cuda runtime error (700) : an illegal memory access was encountered at /pytorch/aten/src/THC/THCCachingHostAllocator.cpp:278
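Because CUDA reports this kind of failure asynchronously, the traceback above blames a matmul rather than the op that actually failed. A minimal diagnostic sketch, assuming the same `model` and `inputs` dict that run_classification.py builds just before `outputs = model(**inputs)` (the helper name is made up): forcing synchronous kernel launches, or replaying the batch on the CPU, usually surfaces the underlying error, which for LayoutLM is typically an embedding index that is out of range.

```python
# Hedged diagnostic sketch, not part of run_classification.py.
import os
import torch

# Must be set before CUDA is initialized (e.g. at the very top of the script)
# so the traceback points at the kernel that actually failed.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def replay_batch_on_cpu(model, inputs):
    """Re-run one batch on the CPU; `inputs` is the same dict passed to
    model(**inputs) in the training loop."""
    model_cpu = model.to("cpu")
    cpu_inputs = {k: v.to("cpu") if torch.is_tensor(v) else v for k, v in inputs.items()}
    with torch.no_grad():
        model_cpu(**cpu_inputs)  # on CPU this typically raises "index out of range in self"
```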
Top GitHub Comments
And are the coordinates in test_box.txt all below 1000?
It's expecting the bounding boxes of all the words in the documents. If you go through the seq_labeling example you'll see how it works, and it helps a lot.
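Those comments point at the likely cause: the config above has max_2d_position_embeddings = 1024, and LayoutLM's preprocessing normalizes every box into the 0..1000 range (the example logged above ends in [1000, 1000, 1000, 1000]), so a coordinate outside that range indexes past the 2D position embedding table and shows up on the GPU as exactly this illegal-memory-access / CUBLAS error. A minimal check, assuming features shaped like the logged example; the `feature.bboxes` attribute and helper names are assumptions, not the script's actual API:

```python
# Hedged sketch: validate and, if needed, normalize boxes before feature conversion.

def check_bboxes(features, limit=1000):
    """Report any token box with a coordinate outside [0, limit]."""
    bad = 0
    for i, feature in enumerate(features):
        for box in feature.bboxes:  # assumed attribute holding [x0, y0, x1, y1] per token
            if any(c < 0 or c > limit for c in box):
                print(f"feature {i}: box {box} outside [0, {limit}]")
                bad += 1
                break
    print(f"{bad} of {len(features)} features have out-of-range boxes")


def normalize_box(box, page_width, page_height):
    """Scale raw page coordinates into LayoutLM's expected 0..1000 range."""
    x0, y0, x1, y1 = box
    return [
        max(0, min(1000, int(1000 * x0 / page_width))),
        max(0, min(1000, int(1000 * y0 / page_height))),
        max(0, min(1000, int(1000 * x1 / page_width))),
        max(0, min(1000, int(1000 * y1 / page_height))),
    ]
```

If check_bboxes reports out-of-range values, normalizing (or clamping) the boxes before building features should make the crash go away.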