RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
Hi, I was using run_classification.py with the LayoutLM base model on my own document set. However, I came across this error:
04/25/2020 09:26:55 - WARNING - __main__ - Process rank: -1, device: cuda:0, n_gpu: 1, distributed training: False, 16-bits training: False
04/25/2020 09:26:55 - INFO - transformers.configuration_utils - loading configuration file /home/ubuntu/rwik_xx_document_classification/unilm/layoutlm-base-uncased/config.json
04/25/2020 09:26:55 - INFO - transformers.configuration_utils - Model config {
"attention_probs_dropout_prob": 0.1,
"finetuning_task": "cdip",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"is_decoder": false,
"layer_norm_eps": 1e-12,
"max_2d_position_embeddings": 1024,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"num_labels": 2,
"output_attentions": false,
"output_hidden_states": false,
"output_past": true,
"pruned_heads": {},
"torchscript": false,
"type_vocab_size": 2,
"use_bfloat16": false,
"vocab_size": 30522
}
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - Model name '/home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased). Assuming '/home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/' is a path or url to a directory containing tokenizer files.
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - Didn't find file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/added_tokens.json. We won't load it.
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - loading file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/vocab.txt
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - loading file None
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - loading file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/special_tokens_map.json
04/25/2020 09:26:55 - INFO - transformers.tokenization_utils - loading file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/tokenizer_config.json
04/25/2020 09:26:55 - INFO - transformers.modeling_utils - loading weights file /home/ubuntu/rwik-xx/unilm/layoutlm-base-uncased/pytorch_model.bin
04/25/2020 09:27:16 - INFO - transformers.modeling_utils - Weights of LayoutLMForSequenceClassification not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
04/25/2020 09:27:16 - INFO - transformers.modeling_utils - Weights from pretrained model not used in LayoutLMForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight']
04/25/2020 09:27:19 - INFO - __main__ - Training/evaluation parameters Namespace(adam_epsilon=1e-08, cache_dir='', config_name='', data_dir='/home/ubuntu/rwik-xx/xx_pdftotext_html_output/', dev_folder='trial_24_04_test', device=device(type='cuda', index=0), do_eval=True, do_lower_case=True, do_train=True, eval_all_checkpoints=True, evaluate_during_training=True, fp16=False, fp16_opt_level='O1', gradient_accumulation_steps=2, hierarchical_tokens=False, learning_rate=5e-05, local_rank=-1, logging_steps=100, max_grad_norm=1.0, max_seq_length=448, max_steps=10000, model_name_or_path='/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm-base-uncased/', model_type='layoutlm', n_gpu=1, no_cuda=False, nuance_mode=True, num_train_epochs=4.0, output_dir='/home/ubuntu/rwik_xx_document_classification/weight_files/1_layout_lm_base_24_04/', output_mode='classification', overwrite_cache=False, overwrite_output_dir=False, per_gpu_eval_batch_size=16, per_gpu_train_batch_size=8, save_steps=100, seed=566, server_ip='', server_port='', stride_len=112, task_name='cdip', tokenizer_name='', tpu=False, tpu_ip_address='', tpu_name='', tqdm_notebook_mode=False, train_folder='trial_24_04_train', warmup_steps=500, weight_decay=0.0, xrt_tpu_config='')
04/25/2020 09:27:19 - INFO - __main__ - Creating features from dataset file at /home/ubuntu/rwik-xx/xx_pdftotext_html_output/
Gettting train examples: 100%|██████████| 789/789 [00:26<00:00, 29.34it/s]
0%| | 0/789 [00:00<?, ?it/s]04/25/2020 09:27:46 - INFO - utils_classification - *** Example ***
04/25/2020 09:27:46 - INFO - utils_classification - guid: train-1
04/25/2020 09:27:46 - INFO - utils_classification - input_ids: 101 9986 2271 23773 11255 8909 1024 5709... ( truncated )
04/25/2020 09:27:46 - INFO - utils_classification - bboxes: [0, 0, 0, 0] [0, 0, 0, 0] [55, 44, 109, 53] [55, 44, 109, 53] [55, 44, 109, 53] [112, 44, 164, 53] ......[1000, 1000, 1000, 1000] ( truncated )
04/25/2020 09:27:46 - INFO - utils_classification - attention_mask: 1 1 1 1 1 1 1 1 1 1 1 ... ( truncated )
04/25/2020 09:27:46 - INFO - utils_classification - token_type_ids: 0 0 0 0 0 0 0 0 0 0 0 ... ( truncated )
04/25/2020 09:27:46 - INFO - utils_classification - label: xxx (id = 1)
04/25/2020 09:27:46 - INFO - utils_classification - *** Example ***
.
.
.
(truncated)
100%|██████████| 789/789 [00:30<00:00, 26.17it/s]
04/25/2020 09:28:16 - INFO - __main__ - Saving features into cached file /home/ubuntu/rwik_xx/xx_pdftotext_html_output/cached_train_layoutlm-base-uncased_448_cdip
04/25/2020 09:28:19 - INFO - __main__ - ***** Running training *****
04/25/2020 09:28:19 - INFO - __main__ - Num examples = 1929
04/25/2020 09:28:19 - INFO - __main__ - Num Epochs = 83
04/25/2020 09:28:19 - INFO - __main__ - Instantaneous batch size per GPU = 8
04/25/2020 09:28:19 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 16
04/25/2020 09:28:19 - INFO - __main__ - Gradient Accumulation steps = 2
04/25/2020 09:28:19 - INFO - __main__ - Total optimization steps = 10000
Epoch: 0%| | 0/83 [00:00<?, ?it/s]
Iteration: 0%| | 0/242 [00:00<?, ?it/s]
Epoch: 0%| | 0/83 [00:00<?, ?it/s]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/run_classification.py in <module>()
925
926 if __name__ == "__main__":
--> 927 main()
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/run_classification.py in main()
859 args, args.task_name, tokenizer, evaluate=False
860 )
--> 861 global_step, tr_loss = train(args, train_dataset, model, tokenizer)
862 logger.info(" global_step = %s, average loss = %s", global_step, tr_loss)
863
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/run_classification.py in train(args, train_dataset, model, tokenizer)
217 ) # XLM, DistilBERT and RoBERTa don't use segment_ids
218 #pdb.set_trace()
--> 219 outputs = model(**inputs)
220 loss = outputs[
221 0
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/modeling_layoutlm.py in forward(self, input_ids, bbox, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels)
328 token_type_ids=token_type_ids,
329 position_ids=position_ids,
--> 330 head_mask=head_mask,
331 )
332
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/rwik_xx_document_classification/unilm/layoutlm/modeling_layoutlm.py in forward(self, input_ids, bbox, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask)
191 )
192 encoder_outputs = self.encoder(
--> 193 embedding_output, extended_attention_mask, head_mask=head_mask
194 )
195 sequence_output = encoder_outputs[0]
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
378 all_hidden_states = all_hidden_states + (hidden_states,)
379
--> 380 layer_outputs = layer_module(hidden_states, attention_mask, head_mask[i], encoder_hidden_states, encoder_attention_mask)
381 hidden_states = layer_outputs[0]
382
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
349
350 def forward(self, hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None):
--> 351 self_attention_outputs = self.attention(hidden_states, attention_mask, head_mask)
352 attention_output = self_attention_outputs[0]
353 outputs = self_attention_outputs[1:] # add self attentions if we output attention weights
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
303
304 def forward(self, hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None):
--> 305 self_outputs = self.self(hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
306 attention_output = self.output(self_outputs[0], hidden_states)
307 outputs = (attention_output,) + self_outputs[1:] # add attentions if we output them
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/transformers/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask)
213
214 def forward(self, hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None):
--> 215 mixed_query_layer = self.query(hidden_states)
216
217 # If this is instantiated as a cross-attention module, the keys
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/modules/linear.py in forward(self, input)
85
86 def forward(self, input):
---> 87 return F.linear(input, self.weight, self.bias)
88
89 def extra_repr(self):
/home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/functional.py in linear(input, weight, bias)
1610 ret = torch.addmm(bias, input, weight.t())
1611 else:
-> 1612 output = input.matmul(weight.t())
1613 if bias is not None:
1614 output += bias
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
I am using torch 1.3.0, transformers 2.2.1, and Python 3.
Note that I have modified the loading process to suit the format my dataset is in, and I have verified that it is being loaded correctly. In addition, I tried running roberta-base on the same dataset and it worked.
When I ran the Python debugger and checked the value of the input variable, I got this error:
> /home/ubuntu/.local/lib/python3.6/site-packages/torch/nn/functional.py(1612)linear()
1610 ret = torch.addmm(bias, input, weight.t())
1611 else:
-> 1612 output = input.matmul(weight.t())
1613 if bias is not None:
1614 output += bias
ipdb> input
*** RuntimeError: cuda runtime error (700) : an illegal memory access was encountered at /pytorch/aten/src/THC/THCCachingHostAllocator.cpp:278
ipdb> weight
*** RuntimeError: cuda runtime error (700) : an illegal memory access was encountered at /pytorch/aten/src/THC/THCCachingHostAllocator.cpp:278
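Because CUDA reports this kind of failure asynchronously, the traceback above blames a matmul rather than the op that actually failed. A minimal diagnostic sketch, assuming the same `model` and `inputs` dict that run_classification.py builds just before `outputs = model(**inputs)` (the helper name is made up): forcing synchronous kernel launches, or replaying the batch on the CPU, usually surfaces the underlying error, which for LayoutLM is typically an embedding index that is out of range.

```python
# Hedged diagnostic sketch, not part of run_classification.py.
import os
import torch

# Must be set before CUDA is initialized (e.g. at the very top of the script)
# so the traceback points at the kernel that actually failed.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def replay_batch_on_cpu(model, inputs):
    """Re-run one batch on the CPU; `inputs` is the same dict passed to
    model(**inputs) in the training loop."""
    model_cpu = model.to("cpu")
    cpu_inputs = {k: v.to("cpu") if torch.is_tensor(v) else v for k, v in inputs.items()}
    with torch.no_grad():
        model_cpu(**cpu_inputs)  # on CPU this typically raises "index out of range in self"
```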
Top GitHub Comments
And are the coordinates in test_box.txt all below 1000?
It's expecting the bounding boxes of all the words in the documents. If you go through the seq_labeling example you'll see how it works, and it helps a lot.
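Those comments point at the likely cause: the config above has max_2d_position_embeddings = 1024, and LayoutLM's preprocessing normalizes every box into the 0..1000 range (the example logged above ends in [1000, 1000, 1000, 1000]), so a coordinate outside that range indexes past the 2D position embedding table and shows up on the GPU as exactly this illegal-memory-access / CUBLAS error. A minimal check, assuming features shaped like the logged example; the `feature.bboxes` attribute and helper names are assumptions, not the script's actual API:

```python
# Hedged sketch: validate and, if needed, normalize boxes before feature conversion.

def check_bboxes(features, limit=1000):
    """Report any token box with a coordinate outside [0, limit]."""
    bad = 0
    for i, feature in enumerate(features):
        for box in feature.bboxes:  # assumed attribute holding [x0, y0, x1, y1] per token
            if any(c < 0 or c > limit for c in box):
                print(f"feature {i}: box {box} outside [0, {limit}]")
                bad += 1
                break
    print(f"{bad} of {len(features)} features have out-of-range boxes")


def normalize_box(box, page_width, page_height):
    """Scale raw page coordinates into LayoutLM's expected 0..1000 range."""
    x0, y0, x1, y1 = box
    return [
        max(0, min(1000, int(1000 * x0 / page_width))),
        max(0, min(1000, int(1000 * y0 / page_height))),
        max(0, min(1000, int(1000 * x1 / page_width))),
        max(0, min(1000, int(1000 * y1 / page_height))),
    ]
```

If check_bboxes reports out-of-range values, normalizing (or clamping) the boxes before building features should make the crash go away.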