Export to ONNX and use ONNX Runtime, working. Guide.

This is an explanation of how to export the recognition model and the detection model to ONNX format, followed by a brief explanation of how to run these models with ONNX Runtime.

ONNX is an interoperability standard for AI models. It lets us use the same model across different programming languages, operating systems, acceleration platforms and runtimes. Personally, I need a C++ build of the EasyOCR functionality. After failing, for several reasons, to make a C++ build using PyTorch and the EasyOCR models, I found that the best solution is to convert the models to ONNX and then program in C++ using ONNX Runtime. Compiling is then very easy compared to PyTorch.

Due to time constraints I am not presenting a PR. It will be necessary for you to modify a copy of EasyOCR locally.

Requirements

We must install the modules: onnx and onnxruntime. In my case I also had to manually install the protobuf module in version 3.20.

I am using:

EasyOCR 1.5.0
Python 3.9.9
torch 1.10.1
torchvision 0.11.2
onnx 1.11.0
onnxruntime 1.11.1

Exporting ONNX models

The best place to modify the EasyOCR code to export the models is right after EasyOCR uses the loaded model to perform the prediction.

Exporting detection model

In easyocr/detection.py after y, feature = net(x) (line 46) add:

    batch_size_1 = 500
    batch_size_2 = 500
    in_shape=[1, 3, batch_size_1, batch_size_2]
    dummy_input = torch.rand(in_shape)
    dummy_input = dummy_input.to(device)

    torch.onnx.export(
        net.module,
        dummy_input,
        "detectionModel.onnx",
        export_params=True,
        opset_version=11,
        input_names = ['input'],
        output_names = ['output'],
        dynamic_axes={'input' : {2 : 'batch_size_1', 3: 'batch_size_2'}},
    )

We generate a dummy input, totally random, so that the export can run. The values do not matter; what matters is that the input has the correct structure. The detection model takes a 4-dimensional tensor, where the first dimension always has a value of 1, the second a value of 3, and the third and fourth depend on the resolution of the analyzed image. I reached this conclusion by analyzing the data flow; I may be wrong, in which case this needs to be corrected.

Note that we export with the parameters (export_params=True) and specify that the two final dimensions of the input tensor are of dynamic size (dynamic_axes=...).
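To make the input layout described above concrete, here is a minimal NumPy sketch that rearranges an image stored as height x width x channels into the 1 x 3 x H x W layout. The function name to_detector_input is hypothetical, and the sketch deliberately skips EasyOCR's own resizing and normalization, so it only illustrates the shape and nothing else.

```python
import numpy as np

def to_detector_input(image_hwc):
    """Rearrange an H x W x 3 image into a float32 array of shape (1, 3, H, W)."""
    if image_hwc.ndim != 3 or image_hwc.shape[2] != 3:
        raise ValueError("expected an H x W x 3 image")
    chw = np.transpose(image_hwc, (2, 0, 1))   # move channels to the front
    batched = chw[np.newaxis, ...]             # add the batch dimension of size 1
    return np.ascontiguousarray(batched, dtype=np.float32)

# Two images with different resolutions: only the last two dimensions change.
small = to_detector_input(np.zeros((480, 640, 3), dtype=np.uint8))
large = to_detector_input(np.zeros((720, 1280, 3), dtype=np.uint8))
print(small.shape, large.shape)
```

The first two dimensions stay fixed while the last two follow the image resolution, which is why only axes 2 and 3 are declared in dynamic_axes.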

Then we can add this code to immediately import the exported model and validate that it is not corrupted:

onnx_model = onnx.load("detectionModel.onnx")
try:
    onnx.checker.check_model(onnx_model)
except onnx.checker.ValidationError as e:
    print('The model is invalid: %s' % e)
else:
    print('The model is valid!')

Remember to import onnx in the file header.

To run the export, just use EasyOCR to analyze any image, indicating the language to be detected. This will download the corresponding model, run the detection and simultaneously export the model. If we change the language we will have to export a new model. Once the model is exported, we can comment out or delete the added code.

Exporting the recognition model

This model is a bit more difficult to export and we will have to do some black magic.

In easyocr/recognition.py after preds = model(image, text_for_pred) (line 111) add:

    batch_size_1_1 = 500
    in_shape_1=[1, 1, 64, batch_size_1_1]
    dummy_input_1 = torch.rand(in_shape_1)
    dummy_input_1 = dummy_input_1.to(device)

    batch_size_2_1 = 50
    in_shape_2=[1, batch_size_2_1]
    dummy_input_2 = torch.rand(in_shape_2)
    dummy_input_2 = dummy_input_2.to(device)

    dummy_input = (dummy_input_1, dummy_input_2)

    torch.onnx.export(
        model.module,
        dummy_input,
        "recognitionModel.onnx",
        export_params=True,
        opset_version=11,
        input_names = ['input1','input2'],
        output_names = ['output'],
        dynamic_axes={'input1' : {3 : 'batch_size_1_1'}},
    )

As with the detection model, we create a dummy input to be able to export the model. In this case, the model takes 2 inputs.

The first element is a 4-dimensional tensor, where the first dimension always has a value of 1, the second a value of 1, the third a value of 64 and the fourth a dynamic value.

The second element is a 2-dimensional tensor, where the first dimension always has a value of 1 and the second a dynamic value.

Again, I may be wrong about the structure of these inputs; this is what I observed empirically.

First strange thing: ONNX, for some reason, while analyzing the model structure, concludes that the second input does not contribute anything. So even if we tell ONNX to export a model with 2 inputs, it always exports a model with 1 input. This appears to be caused by an internal ONNX step that “cuts” the parts of the graph that do not alter the network output. According to the documentation, we should be able to stop this “cutting” and export the network without optimization by passing do_constant_folding=False, but due to a bug it has no effect. Despite this, the missing second input does not cause any loss of accuracy in the model. For this reason, in dynamic_axes we only declare the first input, whose fourth dimension (index 3) is variable in size. If anyone manages to export the model with both inputs, please let us know.

Second strange thing: in order to export the recognition model, we must edit easyocr/model/vgg_model.py. It turns out that the AdaptiveAvgPool2d operator is not fully supported by the ONNX export. When the output-size tuple contains “None” (which means that dimension keeps the size of the input), the export fails. To fix this we need to change line 11:

From self.AdaptiveAvgPool = nn.AdaptiveAvgPool2d((None, 1)) to self.AdaptiveAvgPool = nn.AdaptiveAvgPool2d((256, 1))

Why 256? I don’t know. Is there a better option? I have not found one. Does it generate errors in the model? I have not been able to find any accuracy problems. If someone can explain why with 256 it works and what the consequences are, it would be appreciated.
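One possible explanation, offered as an assumption and not verified against the EasyOCR source: if the dimension that “None” leaves untouched has exactly 256 elements at that point in the network (for example, 256 feature channels), then asking for an output size of 256 pools each element on its own and changes nothing. The NumPy sketch below reimplements the usual adaptive-pooling bin rule in one dimension to show that pooling to the input's own size is an identity operation. The helper adaptive_avg_pool_1d is hypothetical, not a PyTorch or ONNX function.

```python
import math
import numpy as np

def adaptive_avg_pool_1d(values, output_size):
    """Adaptive average pooling over a 1-D array.

    Bin i averages the indices from floor(i * n / out) up to
    (but not including) ceil((i + 1) * n / out).
    """
    n = len(values)
    pooled = []
    for i in range(output_size):
        start = (i * n) // output_size
        end = math.ceil((i + 1) * n / output_size)
        pooled.append(values[start:end].mean())
    return np.array(pooled)

# Assumption: the untouched dimension holds 256 values.
features = np.random.default_rng(0).random(256)

same_size = adaptive_avg_pool_1d(features, 256)   # every bin holds one element
half_size = adaptive_avg_pool_1d(features, 128)   # every bin averages two elements

print(np.array_equal(same_size, features))
print(half_size.shape)
```

If that assumption holds, replacing “None” with 256 would not alter the output, which would match the absence of accuracy problems. A model variant whose corresponding dimension is not 256 would then need its own matching number.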

Well then, just like the detection model we can add these lines to validate the exported model:

onnx_model = onnx.load("recognitionModel.onnx")
try:
    onnx.checker.check_model(onnx_model)
except onnx.checker.ValidationError as e:
    print('The model is invalid: %s' % e)
else:
    print('The model is valid!')

Remember to import onnx in the file header.

To export the recognition model we must run EasyOCR using any image and the desired language. During the process some warnings will be generated, but you can ignore them. The model will be exported several times, since the added code sits inside a for loop, but this should not cause any problems. Remember to comment out or remove the added code afterwards. If you change the language, you must export a new ONNX model.
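If the repeated export inside the loop is a nuisance, one option is to guard it so it only runs when the file does not exist yet. The sketch below is an assumption, not part of EasyOCR: export_once is a hypothetical helper, and fake_export stands in for the real torch.onnx.export call so the example runs on its own.

```python
import os
import tempfile

def export_once(path, export_fn):
    """Call export_fn(path) only if no file exists at path yet.

    Returns True when the export ran and False when it was skipped.
    """
    if os.path.exists(path):
        return False
    export_fn(path)
    return True

def fake_export(path):
    # Stand-in for the real export; it just writes a placeholder file.
    with open(path, "wb") as handle:
        handle.write(b"placeholder")

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "recognitionModel.onnx")
    # Simulate the loop calling the export three times.
    results = [export_once(target, fake_export) for _ in range(3)]

print(results)
```

In recognition.py the second argument would be a small function wrapping the torch.onnx.export call shown earlier. Note that a guard like this also skips the export when a stale file from another language is still on disk, so delete the old file before switching languages.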

Using ONNX models in EasyOCR

To test and validate that the models work, we will modify the code again. This time we will comment out the lines where EasyOCR uses the PyTorch prediction and add code that uses ONNX Runtime to perform the prediction.

Using the ONNX detection model

First we must add this helper function to the file easyocr/detection.py:

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

Then we must comment out line 46, where it says y, feature = net(x). After this line we must add:

ort_session = onnxruntime.InferenceSession("detectionModel.onnx")
ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(x)}
ort_outs = ort_session.run(None, ort_inputs)
y = ort_outs[0]

Remember to import onnxruntime in the file header.

In this way we load the ONNX detection model and pass it the value “x” as input. Since ONNX Runtime does not use PyTorch, we must convert “x” from a Tensor to a standard numpy array. For that we use the to_numpy helper function. The output of ONNX Runtime is stored in the “y” variable.

One last modification must be made on lines 51 and 52. Change from:

score_text = out[:, :, 0].cpu().data.numpy()
score_link = out[:, :, 1].cpu().data.numpy()

to:

score_text = out[:, :, 0]
score_link = out[:, :, 1]

This is because the model output is already a numpy array and does not need to be converted from a Tensor.

To test, we can run EasyOCR with some image and see the result.

Using the ONNX recognition model

We must add the same helper function to the file easyocr/recognition.py:

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

Then we must comment out line 111 to stop using the PyTorch prediction: preds = model(image, text_for_pred). Right after that line, add:

ort_session = onnxruntime.InferenceSession("recognitionModel.onnx")
ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(image)}
ort_outs = ort_session.run(None, ort_inputs)
preds = torch.from_numpy(ort_outs[0])

Remember to import onnxruntime in the file header.

Note that we are only passing one input, although this model is, in theory, supposed to receive two. As with the detection model, the input must be converted from a Tensor to a numpy array. We convert the output from an array back to a Tensor so that the data flow continues normally.

To test, we can run EasyOCR with some image and see the result.

Others

We can use this function to compare the output of the PyTorch model and the ONNX model to quantify the difference:

np.testing.assert_allclose(to_numpy(<PYTORCH_PREDICTION>), <ONNX_PREDICTION>, rtol=1e-03, atol=1e-05)

In my tests, the difference between the detection models is minimal and passes the test correctly.

For the recognition models the difference is slightly larger and the test fails. Even so, it fails by very little, and I have not observed failures in the actual recognition of the characters. I don’t know whether this is due to the problem with ONNX not detecting the two inputs, the problem with AdaptiveAvgPool2d, or just natural error from the model export and floating-point approximations.
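When the test fails "by very little", it helps to see the actual numbers. The sketch below applies the same per-element criterion that np.testing.assert_allclose documents, |actual - desired| <= atol + rtol * |desired|, but returns a summary instead of raising. The function compare_outputs and the sample arrays are hypothetical; in practice the two arguments would be the PyTorch prediction (converted with to_numpy) and the ONNX Runtime prediction.

```python
import numpy as np

def compare_outputs(actual, desired, rtol=1e-3, atol=1e-5):
    """Summarize how far `actual` is from `desired`.

    An element counts as mismatched when
    |actual - desired| > atol + rtol * |desired|.
    """
    actual = np.asarray(actual, dtype=np.float64)
    desired = np.asarray(desired, dtype=np.float64)
    abs_diff = np.abs(actual - desired)
    allowed = atol + rtol * np.abs(desired)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mismatched": int((abs_diff > allowed).sum()),
        "total": int(desired.size),
    }

# Made-up values for illustration only, not real model outputs.
baseline = np.array([1.0, 2.0, 3.0])
close = compare_outputs(baseline, baseline + 1e-6)
off = compare_outputs(baseline, np.array([1.0, 2.0, 3.01]))
print(close)
print(off)
```

Reporting the largest absolute difference and the number of mismatched elements shows whether a failure comes from a handful of values just over the tolerance or from a systematic shift.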

Final note

I hope this will help with the continued development of this excellent tool. I hope that experts in EasyOCR and PyTorch can review this and find answers to the questions raised.

Top GitHub Comments

Kromtar commented, Aug 10, 2022

@long-senpai I don’t really know why this happens in the conversion process. I think that ONNX, when optimizing the model, discovers that the weights provided by input 2 are unnecessary; so it deletes them.

The last few weeks I have been working on comparing performances between the models before and after converting.

What I can confirm is that independent of input 2, the output of the converted model is the same as the output of the original model. So don’t worry, there is no loss of performance.

Again, the origin of why ONNX does that, I don’t know.

Kromtar commented, Jul 17, 2022

I have created a new issue where I have made available the ONNX version of the EasyOCR models for all languages. Feel free to download and use them.
