Triton cannot run inference on `tf.math.l2_normalize` correctly from NGC 21.06 to NGC 22.03 (Triton 2.20.0)
Description
I’ve converted layers from TensorFlow to TensorRT.
This model’s results are valid up to NGC 21.05.
From 21.06 to 22.01, TensorRT models containing layers converted from tf.math.l2_normalize
produce wrong results in Triton Inference Server.
Expected values are in the range 0.0 to 1.0, but the server generates wrong values
on the order of 1e+2 to 1e+5, as if the outputs were exploding.
I get correct results when running inference without Triton using this script: https://gist.github.com/kimdwkimdw/2188b4b9ff9f9bf2f7e8dbd999769796
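For reference, tf.math.l2_normalize divides each vector by the maximum of its L2 norm and a small epsilon, so every output element must lie in [-1, 1]. A minimal NumPy sketch of those semantics (an assumption based on the TensorFlow documentation, not code from the repro script) gives a ground truth to compare the Triton outputs against:

```python
import numpy as np

def l2_normalize(x, axis=-1, epsilon=1e-12):
    # Reference semantics of tf.math.l2_normalize:
    # output = x / sqrt(max(sum(x**2, axis), epsilon))
    sq_sum = np.sum(np.square(x), axis=axis, keepdims=True)
    return x / np.sqrt(np.maximum(sq_sum, epsilon))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 128)).astype(np.float32)
y = l2_normalize(x)

print(np.abs(y).max())             # never exceeds 1.0
print(np.linalg.norm(y, axis=-1))  # each row norm is ~1.0
```

Any serving stack that returns values around 1e+2 to 1e+5 for such a layer is therefore clearly miscomputing it.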
reference: https://github.com/NVIDIA/TensorRT/issues/1707
Triton Information
NGC tags up to 21.05-py3: valid result
NGC tags from 21.06-py3 to 22.01-py3: wrong result
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver
To Reproduce
Take any TensorFlow network containing the tf.math.l2_normalize
operation and convert it to ONNX with opset 11 to 13.
Next, convert the ONNX model to a TensorRT model using TensorRT 21.06 to 22.01 (https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt).
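When comparing the TensorRT engine's outputs against the TensorFlow baseline, the bug is easy to flag automatically, since valid outputs are bounded by 1 while the bad ones reach 1e+2 to 1e+5. A small check like this (a hypothetical helper, not part of the linked repro script) makes the pass/fail decision explicit:

```python
import numpy as np

def looks_exploded(outputs, limit=1.0, tol=1e-3):
    # Hypothetical helper: flags outputs that fall outside the
    # [-limit, limit] range expected after L2 normalization.
    return bool(np.abs(np.asarray(outputs)).max() > limit + tol)

good = np.array([0.1, 0.5, -0.9])   # plausible normalized values
bad = np.array([3.2e4, 1.7e2])      # the kind of values Triton returns here

print(looks_exploded(good))  # False
print(looks_exploded(bad))   # True
```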
Expected behavior
Output values should be in the range 0.0 to 1.0. Instead, the server generates wrong values
on the order of 1e+2 to 1e+5, as if the outputs were exploding.
Issue Analytics
- Created 2 years ago
- Comments: 13 (8 by maintainers)
Top GitHub Comments
Thanks for waiting. This should be fixed now in TRT 8.5.
Thank you for the update. @yuanyao-nv is investigating.
(@cnegron-nv, I’m leaving this ticket open for now. Feel free to update me or the ticket if it should be closed, since the issue is happening within TensorRT and there’s an open TensorRT GitHub issue for it.)