Export LayoutLMv2 to onnx
I am trying to export the LayoutLMv2 model to ONNX, but the transformers library does not support this yet. I tried to follow the approach available for LayoutLM, but it does not work. Here is my config class for LayoutLMv2:
class LayoutLMv2OnnxConfig(OnnxConfig):
    def __init__(
        self,
        config: PretrainedConfig,
        task: str = "default",
        patching_specs: List[PatchingSpec] = None,
    ):
        super().__init__(config, task=task, patching_specs=patching_specs)
        self.max_2d_positions = config.max_2d_position_embeddings - 1

    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        return OrderedDict(
            [
                ("input_ids", {0: "batch", 1: "sequence"}),
                ("bbox", {0: "batch", 1: "sequence"}),
                ("image", {0: "batch", 1: "sequence"}),
                ("attention_mask", {0: "batch", 1: "sequence"}),
                ("token_type_ids", {0: "batch", 1: "sequence"}),
            ]
        )

    def generate_dummy_inputs(
        self,
        tokenizer: PreTrainedTokenizer,
        batch_size: int = -1,
        seq_length: int = -1,
        is_pair: bool = False,
        framework: Optional[TensorType] = None,
    ) -> Mapping[str, Any]:
        """
        Generate inputs to provide to the ONNX exporter for the specific framework

        Args:
            tokenizer: The tokenizer associated with this model configuration
            batch_size: The batch size (int) to export the model for (-1 means dynamic axis)
            seq_length: The sequence length (int) to export the model for (-1 means dynamic axis)
            is_pair: Indicate if the input is a pair (sentence 1, sentence 2)
            framework: The framework (optional) the tokenizer will generate tensor for

        Returns:
            Mapping[str, Tensor] holding the kwargs to provide to the model's forward function
        """
        input_dict = super().generate_dummy_inputs(tokenizer, batch_size, seq_length, is_pair, framework)

        # Generate a dummy bbox
        box = [48, 84, 73, 128]

        if not framework == TensorType.PYTORCH:
            raise NotImplementedError("Exporting LayoutLM to ONNX is currently only supported for PyTorch.")

        if not is_torch_available():
            raise ValueError("Cannot generate dummy inputs without PyTorch installed.")
        import torch

        batch_size, seq_length = input_dict["input_ids"].shape
        input_dict["bbox"] = torch.tensor([*[box] * seq_length]).tile(batch_size, 1, 1)
        return input_dict


onnx_config = LayoutLMv2OnnxConfig(model.config)
export(tokenizer=tokenizer, model=model, config=onnx_config, opset=12, output=Path('onnx/layoutlmv2.onnx'))
Running the export line raises this error:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-25-99a1f167e396> in <module>()
----> 1 export(tokenizer=tokenizer, model=model, config=onnx_config, opset=12, output=Path('onnx/layoutlmv2.onnx'))
3 frames
/usr/local/lib/python3.7/dist-packages/transformers/models/layoutlmv2/tokenization_layoutlmv2.py in __call__(self, text, text_pair, boxes, word_labels, add_special_tokens, padding, truncation, max_length, stride, pad_to_multiple_of, return_tensors, return_token_type_ids, return_attention_mask, return_overflowing_tokens, return_special_tokens_mask, return_offsets_mapping, return_length, verbose, **kwargs)
449
450 words = text if text_pair is None else text_pair
--> 451 assert boxes is not None, "You must provide corresponding bounding boxes"
452 if is_batched:
453 assert len(words) == len(boxes), "You must provide words and boxes for an equal amount of examples"
AssertionError: You must provide corresponding bounding boxes
Issue Analytics
- State:
- Created 2 years ago
- Reactions:8
- Comments:22 (5 by maintainers)
Read more >Contribute to huggingface/transformers · GitHub
Export LayoutLMv2 to onnx Good First Issue. #14368 opened on Nov 11, 2021 by fadi212. 22. LayoutLMv2 model not supporting training on more...
Read more >
It seems to come from the LayoutLMv2Tokenizer, which takes boxes (bbox) as inputs. Here you are calling super().generate_dummy_inputs, which uses the tokenizer to create dummy inputs, but this does not provide the boxes to the tokenizer, hence the error. There are two ways of solving this issue:
Hi @viantirreau @lalitr994, you can take a look at this PR and convert your model with this branch: https://github.com/huggingface/transformers/pull/14555
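To make the tokenizer explanation above concrete, here is a minimal stand-alone sketch of the failure and of one possible workaround: build the dummy words and one dummy box per word yourself, instead of relying on a text-only dummy-input generator. Note that `BoxAwareTokenizer`, `generic_dummy_inputs` and `layout_dummy_inputs` are hypothetical names made up for this illustration. The class is only a stand-in that copies the assertion from the traceback, not the real LayoutLMv2Tokenizer, and this is not necessarily what the linked PR does.

```python
DUMMY_BOX = [48, 84, 73, 128]  # same dummy box as in the question


class BoxAwareTokenizer:
    """Hypothetical stand-in for LayoutLMv2Tokenizer (NOT the real class).

    It mimics only the behaviour that matters here: it refuses to
    tokenize words when no bounding boxes are supplied.
    """

    def __call__(self, text, boxes=None):
        assert boxes is not None, "You must provide corresponding bounding boxes"
        assert all(len(words) == len(bs) for words, bs in zip(text, boxes))
        # A real tokenizer maps words to vocabulary ids; here every word
        # simply gets id 1 so the example stays self-contained.
        return {
            "input_ids": [[1] * len(words) for words in text],
            "bbox": [list(bs) for bs in boxes],
        }


def generic_dummy_inputs(tokenizer, batch_size, seq_length):
    """Mimics a text-only dummy-input generator: it never passes boxes."""
    dummy_text = [["a"] * seq_length for _ in range(batch_size)]
    return tokenizer(dummy_text)


def layout_dummy_inputs(tokenizer, batch_size, seq_length):
    """Builds dummy words plus one dummy box per word, then tokenizes."""
    dummy_text = [["a"] * seq_length for _ in range(batch_size)]
    dummy_boxes = [[DUMMY_BOX] * seq_length for _ in range(batch_size)]
    return tokenizer(dummy_text, boxes=dummy_boxes)


if __name__ == "__main__":
    tok = BoxAwareTokenizer()
    try:
        generic_dummy_inputs(tok, batch_size=2, seq_length=8)
    except AssertionError as err:
        print("text-only path fails:", err)
    inputs = layout_dummy_inputs(tok, batch_size=2, seq_length=8)
    print(sorted(inputs))
```

In a real config the same idea would presumably go inside generate_dummy_inputs, calling the actual tokenizer with the dummy words and boxes rather than super().generate_dummy_inputs. The exact tokenizer arguments should be checked against the installed transformers version.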