
Yolo-v3-tiny-tf model with INT-8 precision gives bad inferences


Hi,

We are working on integrating the yolo-v3-tiny-tf INT8 IR model into the DL Streamer pipeline, following the documentation provided for changing the model. We were able to integrate and test the non-quantized yolo-v3-tiny-tf IR model, but we failed to get proper inference with the INT8 version of the same model. The converted INT8 model was validated using the Open Model Zoo object detection sample, and there it gave proper inferences.

The steps followed to convert yolo-v3-tiny-tf to an INT8 model are provided below:

This quantization document is based on the yolo_v3_tiny_tf model.

Requirements

  • Openvino-dev 2022.1
  • Openvino 2022.1

Steps for Quantization

Step 1: Obtain the OMZ model (yolo_v3_tiny_tf)

 omz_downloader --name yolo_v3_tiny_tf
 omz_converter --name yolo_v3_tiny_tf

This step downloads the frozen model and converts it to its IR representation.
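To confirm the conversion worked, a quick check with the OpenVINO Python runtime can print the IR's inputs and outputs. This is a minimal sketch; the path assumes the default omz_converter output layout (public/yolo-v3-tiny-tf/FP32), so adjust it to your setup.

    from openvino.runtime import Core

    core = Core()
    # Path assumes the default omz_converter output layout; adjust to your setup
    model = core.read_model("public/yolo-v3-tiny-tf/FP32/yolo-v3-tiny-tf.xml")

    for inp in model.inputs:
        print("input :", inp.get_any_name(), inp.shape, inp.get_element_type())
    for out in model.outputs:
        print("output:", out.get_any_name(), out.shape, out.get_element_type())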

Step 2: Obtain the Dataset for Optimization

For this model, the COCO 2017 validation dataset was selected.

wget http://images.cocodataset.org/zips/val2017.zip

unzip val2017.zip

Step 3: Create a JSON file <quantization_spec.json> (optional; can be used instead of pot command-line arguments)

Note: Use the FP16 IR to convert to FP16-INT8.

{
    "model": {
        "model_name": "yolo-v3-tiny-tf",
        "model": "<path to the model>/yolo-v3-tiny-tf/FP32/yolo-v3-tiny-tf.xml",
        "weights": "<path to the model>/yolo-v3-tiny-tf/FP32/yolo-v3-tiny-tf.bin"
    },
    "engine": {
        "type": "simplified",
        "data_source": "<path to the dataset where the images are stored>/val2017"
    },
    "compression": {
        "target_device": "CPU",
        "algorithms": [
            {
                "name": "DefaultQuantization",
                "params": {
                    "preset": "performance",
                    "stat_subset_size": 300,
                    "shuffle_data": false
                }
            }
        ]
    }
}
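Since the engine type is "simplified", POT only reads raw images from data_source (no annotations), and with shuffle_data set to false it calibrates on roughly the first stat_subset_size (300) images. Before running pot, a small sanity check that the paths in the config actually resolve can save a failed run. This is a minimal sketch using only the standard library; run it after replacing the placeholder paths with real ones.

    import json
    from pathlib import Path

    cfg = json.load(open("quantization_spec.json"))

    # Check that the IR files referenced by the config exist
    for key in ("model", "weights"):
        path = Path(cfg["model"][key])
        print(key, path, "exists" if path.exists() else "MISSING")

    # Check that the calibration image folder is non-empty
    data_dir = Path(cfg["engine"]["data_source"])
    print("data_source", data_dir, "jpg images:", len(list(data_dir.glob("*.jpg"))))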

Step 4: Use the OpenVINO Post-training Optimization Tool (POT) to finish the process

This step converts the FP32/FP16 model to its FP32-INT8/FP16-INT8 counterpart.

The INT8 model will be available in the “yolov3_int8” directory.

pot -c quantization_spec.json --output-dir yolov3_int8 -d
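One useful check on the pot output is to confirm that the resulting IR actually contains FakeQuantize operations; if none are present, the compression step did not take effect. A minimal sketch with the OpenVINO Python runtime follows; the path assumes pot wrote the IR under an "optimized" subfolder of the output directory, so adjust it if the xml landed directly in yolov3_int8.

    from openvino.runtime import Core

    core = Core()
    # Adjust the path if pot placed the IR directly under yolov3_int8
    model = core.read_model("yolov3_int8/optimized/yolo-v3-tiny-tf.xml")

    fq_ops = [op for op in model.get_ops() if op.get_type_name() == "FakeQuantize"]
    print("FakeQuantize ops:", len(fq_ops))  # should be > 0 for a quantized IR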

Step 5: Validation

Test the converted model with the Open Model Zoo object detection demo.

 python3 object_detection_demo.py   -d CPU   -i <path to the input video>  -m <path to INT8 model xml>   -at yolo   --labels <OMZ_DIR>/data/dataset_classes/coco_80cl.txt
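If the OMZ demo looks fine but the pipeline does not, it can also help to compare the raw outputs of the FP32 and FP32-INT8 IRs on the same frame, independent of any demo or pipeline post-processing. The sketch below is a rough comparison only: the preprocessing is deliberately simplistic (a plain resize to the model's 1x416x416x3 NHWC image_input), and frame.jpg is a hypothetical test image, so interpret the two runs relative to each other rather than as absolute detections.

    import cv2
    import numpy as np
    from openvino.runtime import Core

    core = Core()

    def run_once(xml_path, frame_bgr):
        model = core.read_model(xml_path)
        compiled = core.compile_model(model, "CPU")
        # Simplistic preprocessing: resize the BGR frame to the 1x416x416x3 NHWC input
        blob = cv2.resize(frame_bgr, (416, 416)).astype(np.float32)[np.newaxis, ...]
        request = compiled.create_infer_request()
        return request.infer({"image_input": blob})

    frame = cv2.imread("frame.jpg")  # hypothetical test frame
    fp32_out = run_once("<path to the model>/FP32/yolo-v3-tiny-tf.xml", frame)
    int8_out = run_once("<path to the model>/FP32-INT8/yolo-v3-tiny-tf.xml", frame)

    # Large mean differences point at the quantized model itself rather than the pipeline
    for (out32, val32), (out8, val8) in zip(fp32_out.items(), int8_out.items()):
        print(out32.get_any_name(), "mean abs diff:", float(np.abs(val32 - val8).mean()))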

For integrating with the pipeline server, the steps followed are as per the documentation.

Copy the downloaded and converted models under

<pipeline-server>/models/object_detection/yolo-v3-tiny-tf

The directory structure under yolo-v3-tiny-tf looks something like this:

coco-80cl.txt  FP16  FP32  FP32-INT8  yolo-v3-tiny-tf  yolo-v3-tiny-tf.json

Created a new pipeline:

cp -r pipelines/gstreamer/object_detection/person_vehicle_bike pipelines/gstreamer/object_detection/yolo-v3-tiny-tf

Edited the pipeline.json template:

sed -i -e s/\\[person_vehicle_bike\\]/\\[yolo-v3-tiny-tf\\]/g pipelines/gstreamer/object_detection/yolo-v3-tiny-tf/pipeline.json

Ran the pipeline server

./docker/run.sh -v /tmp:/tmp --models models --pipelines pipelines/gstreamer
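With the server running, a pipeline instance can then be started over its REST API. The request below is a sketch: it assumes the default REST port 8080, the request schema shown in the pipeline-server documentation, and that the copied pipeline is exposed as object_detection/yolo-v3-tiny-tf; the source video path is hypothetical.

    import json
    import urllib.request

    # Hypothetical test video mounted under /tmp; results are written as JSON lines
    body = {
        "source": {"uri": "file:///tmp/bottle.mp4", "type": "uri"},
        "destination": {"metadata": {"type": "file",
                                     "path": "/tmp/results.jsonl",
                                     "format": "json-lines"}},
    }

    req = urllib.request.Request(
        "http://localhost:8080/pipelines/object_detection/yolo-v3-tiny-tf",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    print(urllib.request.urlopen(req).read().decode())  # returns the new pipeline instance id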

With this we were able to run inference with the FP16 and FP32 models, but we were not able to run inference with the FP32-INT8 IR model.

Could you please let us know what steps we are missing to integrate the quantized model?

Thanks

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 8

Top GitHub Comments

1 reaction
AishaSamaanKhan commented, Jul 22, 2022

@whbruce These are the same models that I used for the pipeline server. I have tried this with the Open Model Zoo sample on OpenVINO 2022.1.0. Below are the logs and the attachment for the same.

(setup) intel@intel-WL10:~/workspace/open_model_zoo/demos/object_detection_demo/python$ python3 object_detection_demo.py -d CPU -i bottle.mp4 -m /home/intel/workspace/pipeline-server/models/object_detection/yolo-v3-tiny-tf/FP32-INT8/yolo-v3-tiny-tf.xml -at yolo --labels /home/intel/workspace/pipeline-server/models/object_detection/yolo-v3-tiny-tf/coco-80cl.txt
[ INFO ] OpenVINO Runtime
[ INFO ] build: 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ] Reading model /home/intel/workspace/pipeline-server/models/object_detection/yolo-v3-tiny-tf/FP32-INT8/yolo-v3-tiny-tf.xml
[ WARNING ] The parameter “input_size” not found in YOLO wrapper, will be omitted
[ WARNING ] The parameter “num_classes” not found in YOLO wrapper, will be omitted
[ INFO ] Input layer: image_input, shape: [1, 416, 416, 3], precision: f32, layout: NHWC
[ INFO ] Output layer: conv2d_12/Conv2D/YoloRegion, shape: [1, 255, 26, 26], precision: f32, layout:
[ INFO ] Output layer: conv2d_9/Conv2D/YoloRegion, shape: [1, 255, 13, 13], precision: f32, layout:
[ INFO ] The model /home/intel/workspace/pipeline-server/models/object_detection/yolo-v3-tiny-tf/FP32-INT8/yolo-v3-tiny-tf.xml is loaded to CPU
[ INFO ] Device: CPU
[ INFO ] Number of streams: 4
[ INFO ] Number of threads: AUTO
[ INFO ] Number of model infer requests: 5
[ INFO ] Metrics report:
[ INFO ] Latency: 84.1 ms
[ INFO ] FPS: 30.5
[ INFO ] Decoding: 0.4 ms
[ INFO ] Preprocessing: 0.6 ms
[ INFO ] Inference: 81.0 ms
[ INFO ] Postprocessing: 1.9 ms
[ INFO ] Rendering: 0.2 ms
(screenshot attachment: Capture)

0 reactions
AishaSamaanKhan commented, Jul 25, 2022

Yes, @brmarkus.
With the models downloaded from the Open Model Zoo (i.e., the original models), the pipeline server is able to give inferences, but when it comes to the INT8 model quantized with the Accuracy Aware algorithm, inference fails in the pipeline server. The same quantized models were validated with the object detection sample from the OMZ repo.
