Onnxruntime execute failure
Previously, I used the 21.03 container for serving my ONNX model (an exported NVIDIA Citrinet ASR model), and everything worked fine. Now I need to use version 21.07 or later, but I get this cryptic error:
onnxruntime execute failure 1: Non-zero status code returned while running FusedConv node. Name:'Conv_35_Add_36_Relu_37' Status Message: CUDNN error executing cudnnAddTensor(Base::CudnnHandle(), &alpha, Base::s_.z_tensor, Base::s_.z_data, &alpha, Base::s_.y_tensor, Base::s_.y_data)
Just in case, I re-converted Citrinet to ONNX with the latest ONNX Runtime version (1.9.0; I also tried 1.8.1) and checked everything again; nothing changed.
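For anyone trying to reproduce, the export step can be sketched roughly like this (this assumes the NeMo toolkit's ONNX export API; the pretrained checkpoint name and output path are illustrative, not necessarily the exact model I serve):

```python
# Rough export sketch: load a pretrained Citrinet checkpoint with NeMo and export it to ONNX.
# The checkpoint name "stt_en_citrinet_512" and the output path are placeholders.
import nemo.collections.asr as nemo_asr

model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("stt_en_citrinet_512")
model.export("citrinet.onnx")  # NeMo picks the ONNX exporter based on the file extension
```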
I tested several containers: with 21.03-py3 everything works fine and the server returns correct inference output; with 21.06-py3, 21.07-py3, and 21.08-py3 I get the error pasted above.
CUDA: 11.4, Driver Version: 470.57.02
How can I get inference running correctly in the newer container versions?
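For context, a client-side check can look roughly like this sketch (the model name, tensor names, dtypes, and shapes are assumptions for a NeMo-exported Citrinet and must match the deployed model's config.pbtxt; the audio input is just a random placeholder):

```python
# Sketch of a Triton HTTP client check (names/shapes are assumptions and must match the model config).
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

audio = np.random.randn(1, 80, 512).astype(np.float32)  # placeholder mel features (batch, feat, time)
length = np.array([512], dtype=np.int64)                 # placeholder sequence length (batch,)

inputs = [
    httpclient.InferInput("audio_signal", list(audio.shape), "FP32"),
    httpclient.InferInput("length", list(length.shape), "INT64"),
]
inputs[0].set_data_from_numpy(audio)
inputs[1].set_data_from_numpy(length)

result = client.infer(model_name="citrinet", inputs=inputs)
print(result.as_numpy("logprobs").shape)  # output name is also an assumption
```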
Top GitHub Comments
This “workaround” slashes the performance. I feel like this should be reopened.
@CoderHam I found a workaround: everything works fine with
optimization { graph : { level : -1 } }
in config.pbtxt (i.e., with fusion optimizations disabled). I got this idea from here. And, of course, everything is fine when I run inference on the CPU instead of the GPU. Also, according to the ONNX Runtime release notes, v1.8.2 “fixed a crash issue when optimizing Conv->Add->Relu”. However, building the ONNX Runtime backend for Triton with v1.8.2+ didn't help me. Maybe this information will be useful to you in some way.
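For comparison, the same effect can be checked outside Triton by loading the ONNX model directly with onnxruntime and limiting graph optimizations to the basic level, which skips the Conv+Add+Relu fusion that produces the FusedConv node. This is only a sketch of the analogous ORT-level setting (model path and providers are placeholders), not necessarily the exact mapping Triton applies for level -1:

```python
# ORT-level analogue of the Triton workaround: keep only basic graph optimizations so the
# Conv+Add+Relu fusion (FusedConv) is never created. Model path and providers are placeholders.
import onnxruntime as ort

so = ort.SessionOptions()
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_BASIC  # no operator fusions
session = ort.InferenceSession("citrinet.onnx", sess_options=so,
                               providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
```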