Cannot export Donut models to ONNX
System Info
- transformers version: 4.24.0.dev0
- Platform: macOS-10.16-x86_64-i386-64bit
- Python version: 3.8.13
- Huggingface_hub version: 0.10.0
- PyTorch version (GPU?): 1.12.1 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
It seems that the default tolerance of 1e-5 in the ONNX configuration for vision-encoder-decoder models is too small for Donut checkpoints: an atol of roughly 5e-3 to 9e-3 is currently needed. As a result, many (all?) Donut checkpoints can't be exported using the default values in the CLI.
Having said that, the relatively large discrepancy in the exported models suggests there is a deeper issue with tracing these models, and it would be great to eliminate this potential source of error before increasing the default value for atol.
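For reference, here is a minimal sketch of how the discrepancy can be reproduced outside the CLI by comparing the PyTorch encoder against the exported graph directly. The encoder_model.onnx filename and the 2560x1920 input size are assumptions (based on how the CLI splits vision-encoder-decoder exports and on the donut-base preprocessing config), so check them against your local export:

import numpy as np
import onnxruntime as ort
import torch
from transformers import VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_pretrained(
    "naver-clova-ix/donut-base-finetuned-docvqa"
).eval()

# donut-base checkpoints expect 2560x1920 inputs (assumption taken from the
# checkpoint's preprocessing config; adjust if your checkpoint differs)
pixel_values = torch.randn(1, 3, 2560, 1920)

with torch.no_grad():
    reference = model.encoder(pixel_values).last_hidden_state.numpy()

# Filename is an assumption; check what the CLI actually wrote under onnx/
session = ort.InferenceSession("onnx/encoder_model.onnx")
input_name = session.get_inputs()[0].name
onnx_output = session.run(None, {input_name: pixel_values.numpy()})[0]

print("max abs diff:", np.abs(reference - onnx_output).max())  # ~5e-3 to 9e-3 here
np.testing.assert_allclose(reference, onnx_output, atol=1e-5)  # fails at the default atol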
Steps to reproduce:

- Pick one of the Donut checkpoints from the naver-clova-ix org on the Hub
- Export the model using the ONNX CLI, e.g.

python -m transformers.onnx --model=naver-clova-ix/donut-base-finetuned-docvqa --feature=vision2seq-lm onnx/
- The above gives the following error:
ValueError: Outputs values doesn't match between reference model and ONNX exported model: Got max absolute difference of: 0.0091094970703125 for [ -0.6990948 -49.217014 3.7758636 ... 3.2241364 2.7353969
-51.43289 ] vs [ -0.6989002 -49.215897 3.7760048 ... 3.223978 2.7355423
-51.433964 ]
Full stack trace
Framework not requested. Using torch to export to ONNX.
Downloading: 100%|██████████| 4.74k/4.74k [00:00<00:00, 791kB/s]
Downloading: 100%|██████████| 803M/803M [00:09<00:00, 81.2MB/s]
/Users/lewtun/miniconda3/envs/transformers/lib/python3.8/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp:2895.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Downloading: 100%|██████████| 363/363 [00:00<00:00, 85.0kB/s]
Using framework PyTorch: 1.12.1
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:230: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if num_channels != self.num_channels:
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:220: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if width % self.patch_size[1] != 0:
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:223: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if height % self.patch_size[0] != 0:
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:536: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if min(input_resolution) <= self.window_size:
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:136: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
batch_size, height // window_size, window_size, width // window_size, window_size, num_channels
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:148: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
windows = windows.view(-1, height // window_size, width // window_size, window_size, window_size, num_channels)
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:622: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
was_padded = pad_values[3] > 0 or pad_values[5] > 0
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:623: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if was_padded:
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:411: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
batch_size // mask_shape, mask_shape, self.num_attention_heads, dim, dim
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:682: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
height_downsampled, width_downsampled = (height + 1) // 2, (width + 1) // 2
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:266: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
should_pad = (height % 2 == 1) or (width % 2 == 1)
/Users/lewtun/git/hf/transformers/src/transformers/models/donut/modeling_donut_swin.py:267: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if should_pad:
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
[the warning above is repeated 40 times in the original log]
Validating ONNX model...
    -[✓] ONNX model output names match reference model ({'last_hidden_state'})
    - Validating ONNX Model output "last_hidden_state":
        -[✓] (3, 4800, 1024) matches (3, 4800, 1024)
        -[x] values not close enough (atol: 0.0001)
Traceback (most recent call last):
File "/Users/lewtun/miniconda3/envs/transformers/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Users/lewtun/miniconda3/envs/transformers/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/Users/lewtun/git/hf/transformers/src/transformers/onnx/__main__.py", line 180, in <module>
main()
File "/Users/lewtun/git/hf/transformers/src/transformers/onnx/__main__.py", line 107, in main
validate_model_outputs(
File "/Users/lewtun/git/hf/transformers/src/transformers/onnx/convert.py", line 455, in validate_model_outputs
raise ValueError(
ValueError: Outputs values doesn't match between reference model and ONNX exported model: Got max absolute difference of: 0.0091094970703125 for [ -0.6990948 -49.217014 3.7758636 ... 3.2241364 2.7353969
-51.43289 ] vs [ -0.6989002 -49.215897 3.7760048 ... 3.223978 2.7355423
-51.433964 ]
Expected behavior
Donut checkpoints can be exported to ONNX, either via a better default value for atol or via changes to the modeling code that yield much closer agreement between the original and exported models.
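On the first option: the exporter reads the tolerance from the ONNX config's atol_for_validation property, so a Donut-specific config could loosen it without touching other models. A minimal sketch, assuming a hypothetical DonutSwinOnnxConfig (no such class exists in transformers today):

from collections import OrderedDict
from typing import Mapping

from transformers.onnx import OnnxConfig


class DonutSwinOnnxConfig(OnnxConfig):  # hypothetical class, for illustration only
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # A Swin-style vision encoder takes a single image tensor
        return OrderedDict(
            [("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"})]
        )

    @property
    def atol_for_validation(self) -> float:
        # Loosened to cover the 5e-3 to 9e-3 gap observed above
        return 1e-2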
Top GitHub Comments
python -m transformers.onnx --model=naver-clova-ix/donut-base-finetuned-cord-v2 --feature=vision2seq-lm scratch/onnx --atol 1e-2

With --atol 1e-2 it works, but that tolerance is quite loose. I think it is better to convert the encoder and decoder separately and pipeline them together, as sketched below.
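A rough sketch of that split-export idea, driving a greedy decode loop over two onnxruntime sessions. The file names, graph input/output names, shapes, and token ids are assumptions and should be checked against the actual export:

import numpy as np
import onnxruntime as ort

encoder = ort.InferenceSession("onnx/encoder_model.onnx")
decoder = ort.InferenceSession("onnx/decoder_model.onnx")

# Preprocessed document image, e.g. from DonutProcessor (shape is an assumption)
pixel_values = np.random.randn(1, 3, 2560, 1920).astype(np.float32)
encoder_hidden_states = encoder.run(None, {"pixel_values": pixel_values})[0]

decoder_start_token_id = 0  # hypothetical; read it from model.config
eos_token_id = 2            # hypothetical; read it from model.config
input_ids = np.array([[decoder_start_token_id]], dtype=np.int64)

# Plain greedy decoding; the full prefix is re-run each step since this sketch
# assumes a decoder exported without past_key_values
for _ in range(128):
    logits = decoder.run(
        None,
        {"input_ids": input_ids, "encoder_hidden_states": encoder_hidden_states},
    )[0]
    next_token = logits[:, -1].argmax(axis=-1).reshape(1, 1).astype(np.int64)
    input_ids = np.concatenate([input_ids, next_token], axis=-1)
    if next_token.item() == eos_token_id:
        break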
cc @mht-sharma, would you mind taking a look at this? It might be related to some of the subtleties you noticed with Whisper around passing encoder outputs through the model vs. using the getters.