
Got `ONNXRuntimeError` when trying to run BART in ONNX format

See original GitHub issue

Environment info

  • transformers version: 4.9.0
  • Platform: Linux-5.4.104+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.7.11
  • PyTorch version (GPU?): 1.9.0+cu102 (True)
  • Using GPU in script?: Yes

Who can help

@mfuntowicz

To reproduce

I was using Google Colab and trying to export the model facebook/bart-large-cnn to ONNX format. I ran the command `python -m transformers.onnx -m=facebook/bart-large-cnn onnx/bart-large-cnn`, and the output looked fine.

2021-07-22 23:14:33.821472: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Using framework PyTorch: 1.9.0+cu102
Overriding 1 configuration item(s)
	- use_cache -> False
/usr/local/lib/python3.7/dist-packages/transformers/models/bart/modeling_bart.py:212: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/usr/local/lib/python3.7/dist-packages/transformers/models/bart/modeling_bart.py:218: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, tgt_len, src_len):
/usr/local/lib/python3.7/dist-packages/transformers/models/bart/modeling_bart.py:249: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
/usr/local/lib/python3.7/dist-packages/transformers/models/bart/modeling_bart.py:863: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if input_shape[-1] > 1:
tcmalloc: large alloc 1625399296 bytes == 0x5595ce83a000 @  0x7f1780d9f887 0x7f177f695c29 0x7f177f696afb 0x7f177f696bb4 0x7f177f696f9c 0x7f17670dcbb7 0x7f17670dd064 0x7f175b75ba1c 0x7f176bf8eaff 0x7f176b949b88 0x55949fda8bf8 0x55949fe1c6f2 0x55949fe16c35 0x55949fda973a 0x55949fe1893b 0x55949fe16c35 0x55949fda973a 0x55949fe1bf40 0x55949fe16c35 0x55949fda973a 0x55949fe1893b 0x55949fda965a 0x55949fe17b0e 0x55949fda965a 0x55949fe17b0e 0x55949fe16c35 0x55949fe16933 0x55949fe14da0 0x55949fda7ea9 0x55949fda7da0 0x55949fe1bbb3
tcmalloc: large alloc 1625399296 bytes == 0x55962f654000 @  0x7f1780d9f887 0x7f177f695c29 0x7f177f696afb 0x7f177f696bb4 0x7f177f696f9c 0x7f17670dcbb7 0x7f17670dd064 0x7f175b75ba1c 0x7f176bf8ecab 0x7f176b949b88 0x55949fda8bf8 0x55949fe1c6f2 0x55949fe16c35 0x55949fda973a 0x55949fe1893b 0x55949fe16c35 0x55949fda973a 0x55949fe1bf40 0x55949fe16c35 0x55949fda973a 0x55949fe1893b 0x55949fda965a 0x55949fe17b0e 0x55949fda965a 0x55949fe17b0e 0x55949fe16c35 0x55949fe16933 0x55949fe14da0 0x55949fda7ea9 0x55949fda7da0 0x55949fe1bbb3
tcmalloc: large alloc 1625399296 bytes == 0x5595ce83a000 @  0x7f1780d9d1e7 0x55949fdd9a18 0x55949fda4987 0x7f176bf8ece2 0x7f176b949b88 0x55949fda8bf8 0x55949fe1c6f2 0x55949fe16c35 0x55949fda973a 0x55949fe1893b 0x55949fe16c35 0x55949fda973a 0x55949fe1bf40 0x55949fe16c35 0x55949fda973a 0x55949fe1893b 0x55949fda965a 0x55949fe17b0e 0x55949fda965a 0x55949fe17b0e 0x55949fe16c35 0x55949fe16933 0x55949fe14da0 0x55949fda7ea9 0x55949fda7da0 0x55949fe1bbb3 0x55949fe16c35 0x55949fda973a 0x55949fe17b0e 0x55949fe16c35 0x55949fce8eb1
tcmalloc: large alloc 1625399296 bytes == 0x55962f654000 @  0x7f1780d9f887 0x7f177f695c29 0x7f177f695d47 0x7f177f6977a5 0x7f176bd60368 0x7f176bfbc844 0x7f176b949b88 0x55949fda8010 0x55949fda7da0 0x55949fe1bbb3 0x55949fe16c35 0x55949fda973a 0x55949fe1893b 0x55949fe16c35 0x55949fda973a 0x55949fe1bf40 0x55949fe16c35 0x55949fda973a 0x55949fe1893b 0x55949fda965a 0x55949fe17b0e 0x55949fda965a 0x55949fe17b0e 0x55949fe16c35 0x55949fe16933 0x55949fe14da0 0x55949fda7ea9 0x55949fda7da0 0x55949fe1bbb3 0x55949fe16c35 0x55949fda973a
Validating ONNX model...
	-[✓] ONNX model outputs' name match reference model ({'last_hidden_state', 'encoder_last_hidden_state'})
	- Validating ONNX Model output "last_hidden_state":
		-[✓] (2, 8, 1024) matchs (2, 8, 1024)
		-[✓] all values close (atol: 0.0001)
	- Validating ONNX Model output "encoder_last_hidden_state":
		-[✓] (2, 8, 1024) matchs (2, 8, 1024)
		-[✓] all values close (atol: 0.0001)
All good, model saved at: onnx/bart-large-cnn/model.onnx
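
Note the TracerWarnings in the log: torch.onnx export works by tracing, so Python conditionals on tensor shapes are evaluated once with the dummy inputs and baked into the graph as constants. A small sketch, assuming only the standard onnx package, shows which declared input dimensions stayed symbolic and which were frozen:

import onnx

model = onnx.load('onnx/bart-large-cnn/model.onnx')
onnx.checker.check_model(model)  # structural sanity check

# Dimensions exported as dynamic axes keep symbolic names like 'batch' or
# 'sequence'; a bare integer means the dimension was frozen during tracing.
for inp in model.graph.input:
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)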

Then I tried to execute the model in onnxruntime:

import onnxruntime as ort
from transformers import AutoTokenizer

ort_session = ort.InferenceSession('onnx/bart-large-cnn/model.onnx')

# Got input_ids and attention_mask using the tokenizer, e.g.:
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-cnn')
encoded = tokenizer("Example text to summarize.", return_tensors='pt')
input_ids, attention_mask = encoded['input_ids'], encoded['attention_mask']

outputs = ort_session.run(None, {'input_ids': input_ids.detach().cpu().numpy(), 'attention_mask': attention_mask.detach().cpu().numpy()})

And I got this error:

---------------------------------------------------------------------------
RuntimeException                          Traceback (most recent call last)
<ipython-input-30-380e6a0e1085> in <module>()
----> 1 outputs = ort_session.run(None, {'input_ids': input_ids.detach().cpu().numpy(), 'attention_mask': attention_mask.detach().cpu().numpy()})

/usr/local/lib/python3.7/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py in run(self, output_names, input_feed, run_options)
    186             output_names = [output.name for output in self._outputs_meta]
    187         try:
--> 188             return self._sess.run(output_names, input_feed, run_options)
    189         except C.EPFail as err:
    190             if self._enable_fallback:

RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'Reshape_109' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:42 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape&, std::vector<long int>&, bool) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{2}, requested shape:{1,1}

I see that BART was recently added to ONNX export support in the latest release, but there isn't any example code explaining exactly how to run inference in onnxruntime. Maybe I'm doing something wrong here, so any help would be appreciated!
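
The Reshape failure, with input shape {2} against requested shape {1,1}, is consistent with a dimension that was frozen at trace time. A quick check, sketched below using nothing beyond the standard onnxruntime API: inspect the session's declared inputs, then feed inputs of shape (2, 8), the same shape the validation above used.

import numpy as np
import onnxruntime as ort

ort_session = ort.InferenceSession('onnx/bart-large-cnn/model.onnx')

# Dimensions frozen at trace time show up as fixed integers here instead
# of symbolic names such as 'batch' or 'sequence'.
for inp in ort_session.get_inputs():
    print(inp.name, inp.shape, inp.type)

# The validator ran with dummy inputs of shape (2, 8). If that exact shape
# succeeds while other shapes hit the Reshape error, the traced graph only
# generalizes to the tracing shape.
dummy = np.ones((2, 8), dtype=np.int64)
outputs = ort_session.run(None, {'input_ids': dummy, 'attention_mask': dummy})
print(outputs[0].shape)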

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 3
  • Comments: 9 (2 by maintainers)

Top GitHub Comments

2 reactions
ryangawei commented, Sep 1, 2021

@LysandreJik Thank you for the follow-up. I’ll pay attention to any updates.

0 reactions
Avi-avidan commented, May 16, 2022

Fails to export facebook/bart-large-cnn, following the instructions at https://github.com/huggingface/transformers/tree/main/examples/research_projects/onnx/summarization:

(py39) user@Avis-MacBook-Pro-2 summarization % python run_onnx_exporter.py --model_name_or_path facebook/bart-large-cnn
Traceback (most recent call last):
  File "~/src/transformers/examples/research_projects/onnx/summarization/run_onnx_exporter.py", line 207, in <module>
    main()
  File "~/src/transformers/examples/research_projects/onnx/summarization/run_onnx_exporter.py", line 184, in main
    model, tokenizer = load_model_tokenizer(args.model_name_or_path, device)
  File "~/src/transformers/examples/research_projects/onnx/summarization/run_onnx_exporter.py", line 93, in load_model_tokenizer
    huggingface_model = model_dict[model_name].from_pretrained(model_name).to(device)
KeyError: 'facebook/bart-large-cnn'

The same error occurs when trying to export the model lidiya/bart-base-samsum.

Any advice would be greatly appreciated. Thanks.
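
For what it's worth, the KeyError comes from the `model_dict[model_name]` lookup shown in the traceback: that research-project script only handles checkpoints that were hard-coded into its `model_dict`, so any other name fails before the export starts. A possible workaround, sketched below and untested (the class choice is an assumption, not something the traceback confirms), is to register the checkpoint in that dict before the lookup runs:

# Hypothetical patch inside run_onnx_exporter.py, inferred only from the
# traceback above. BartForConditionalGeneration is an assumption about the
# class the script expects for BART checkpoints.
from transformers import BartForConditionalGeneration

print(sorted(model_dict))  # see which checkpoint names the script supports
model_dict["facebook/bart-large-cnn"] = BartForConditionalGeneration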
