
Caffe2 export of quantized model

See original GitHub issue

Hello, everyone

First of all, after a long struggle I have finally managed to export to ONNX a quantized model built on detectron2 and trained with the PyTorch QAT tools. So if anyone is interested in guidance on this, or in a PR adding this functionality to detectron2, please write me back.

I understand that exporting a quantized model with the detectron2/export tools might not be supported for now. But since there are frequent mentions of INT8 in the code, I'll ask the following.

During the conversion, right after the message "ONNX export Done. Exported predict_net (before optimizations)", the code logs an error:

[W init.h:137] Caffe2 GlobalInit should be run before any other API calls.
[W tensor.cc:167] Changing the data type of Tensor is discouraged. Attempt to change data type from: nullptr (uninitialized) to: unsigned char
Traceback (most recent call last):
  File "./tools/caffe2_converter.py", line 111, in <module>
    caffe2_model = export_caffe2_model(cfg, model, first_batch)
  File "/root/some_detectron2/detectron2/export/api.py", line 157, in export_caffe2_model
    return Caffe2Tracer(cfg, model, inputs).export_caffe2()
  File "/root/some_detectron2/detectron2/export/api.py", line 95, in export_caffe2
    predict_net, init_net = export_caffe2_detection_model(model, inputs)
  File "/root/some_detectron2/detectron2/export/caffe2_export.py", line 171, in export_caffe2_detection_model
    init_net = construct_init_net_from_params(params, device_options)
  File "/root/some_detectron2/detectron2/export/shared.py", line 312, in construct_init_net_from_params
    [create_const_fill_op(name, blob, device_option=device_options.get(name, None))]
  File "/root/some_detectron2/detectron2/export/shared.py", line 290, in create_const_fill_op
    assert device_option is None
AssertionError

The model takes floats as inputs and returns floats as outputs; the inputs are implicitly quantized at the beginning of the network.

Is it possible to properly configure the export API to export a quantized model? And is the device_option assertion vital for this kind of export?
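
(For context on the "implicitly quantized at the beginning" part: in eager-mode PyTorch this is typically achieved by wrapping the float model with QuantStub/DeQuantStub before running QAT. Below is a minimal sketch; the QuantDequantWrapper class and the toy backbone are hypothetical stand-ins for the real detectron2 model, and only the torch.quantization calls themselves are actual API.)

import torch
import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub, get_default_qat_qconfig, prepare_qat, convert

class QuantDequantWrapper(nn.Module):
    """Hypothetical helper: quantize inputs on entry, dequantize outputs on exit."""
    def __init__(self, model):
        super().__init__()
        self.quant = QuantStub()      # float -> quint8 at the network entry
        self.model = model
        self.dequant = DeQuantStub()  # quint8 -> float at the network exit

    def forward(self, x):
        return self.dequant(self.model(self.quant(x)))

toy_backbone = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())  # stands in for the real model
qat_model = QuantDequantWrapper(toy_backbone)
qat_model.qconfig = get_default_qat_qconfig("fbgemm")  # x86 CPU backend
prepare_qat(qat_model, inplace=True)                   # insert fake-quant observers
# ... run the QAT training loop here ...
quantized_model = convert(qat_model.eval())            # produces the INT8 model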

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 24 (1 by maintainers)

Top GitHub Comments

4 reactions
ppwwyyxx commented, Jun 3, 2020

It does not include pytorch/caffe2 modifications. We would like to open source them eventually but we likely won’t have time to work on this in the near (3 months) future.

2 reactions
zetyquickly commented, Jun 9, 2020

By the way, https://github.com/sstsai-adl/workshops/tree/master/LPCV_2020/uav_video_challenge contains a quantized CPU model for text detection. It may give an idea of what the model looks like in the end. It was released by a different team for an inference demo only.

Thanks for your concern, I’ll add something on that.

For anyone who wants to export a quantized detectron2 model to TorchScript, I would recommend monkey patching the forward, __init__ and fuse functions of the layers used in your meta architecture, e.g. GeneralizedRCNN. Then do the quantization preparation and fusion, and run QAT or static quantization (see the official docs). After that you can trace the model manually or use the experimental TorchScript tracing functionality of detectron2/export. It should do the trick if you want to deploy on a PC CPU; see the sketch after the next paragraph.

But if you are interested in mobile deployment of the model, TorchScript is not a working approach for now.
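
(A minimal sketch of the CPU flow described above, using static quantization for brevity. The QuantizableModel class, the toy backbone and the fuse list are placeholders rather than detectron2 code; only the torch.quantization and torch.jit calls are actual API.)

import torch
from torch.quantization import QuantStub, DeQuantStub, fuse_modules, get_default_qconfig, prepare, convert

class QuantizableModel(torch.nn.Module):
    """Placeholder for a patched meta architecture such as GeneralizedRCNN."""
    def __init__(self, backbone):
        super().__init__()
        self.quant = QuantStub()
        self.backbone = backbone
        self.dequant = DeQuantStub()

    def forward(self, x):
        return self.dequant(self.backbone(self.quant(x)))

    def fuse(self):
        # In a real backbone, list every Conv2d/BatchNorm2d/ReLU triple to fuse.
        fuse_modules(self.backbone, [["0", "1", "2"]], inplace=True)

backbone = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3), torch.nn.BatchNorm2d(16), torch.nn.ReLU())
model = QuantizableModel(backbone).eval()
model.fuse()                                   # fusion must happen in eval mode
model.qconfig = get_default_qconfig("fbgemm")  # x86 CPU backend
prepare(model, inplace=True)                   # insert observers
model(torch.randn(1, 3, 224, 224))             # calibration pass(es)
convert(model, inplace=True)                   # swap in quantized modules
ts_model = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
ts_model.save("quantized_model.pt")            # deployable on a PC CPU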

Read more comments on GitHub >

Top Results From Across the Web

ONNX export of quantized model - #17 by mhamdan
Hi, I've tried to export simple model using ONNX export and faced an error that ask me to report a bug. import torch...
Read more >
onnx/quantization - Gitter
My understand is current quantized data representation in ONNX is just as defined by Caffe2 ? https://caffe2.ai/docs/operators-catalogue.html# ...
Read more >
Exporting transformers models - Hugging Face
When exporting quantized model you will end up with two different ONNX files. The one specified at the end of the above command...
Read more >
Post-training quantization | TensorFlow Lite
Post-training quantization is a conversion technique that can reduce model size while also improving CPU and hardware accelerator latency, ...
Read more >
Snapdragon Neural Processing Engine SDK: Quantized vs ...
The default output of snpe-caffe-to-dlc and snpe-caffe2-to-dlc is a non-quantized model. This means that all the network parameters are left in the 32 ......
Read more >
