ONNX-exported torchvision FasterRCNN fails on inference request
Description
An internal ONNX Runtime error related to tensor dims occurs when running an ONNX-exported torchvision
FasterRCNN on TRTIS. The error is as follows:
[E:onnxruntime:, sequential_executor.cc:183 Execute] Non-zero status code returned while running ReduceMax node. Name:'' Status Message: /workspace/onnxruntime/onnxruntime/core/providers/cuda/reduction/reduction_ops.cc:110 onnxruntime::common::Status onnxruntime::cuda::PrepareForReduce(onnxruntime::OpKernelContext*, bool, const std::vector<long int>&, const onnxruntime::Tensor**, onnxruntime::Tensor**, int64_t&, int64_t&, std::vector<long int>&, std::vector<long int>&, std::vector<long int>&) keepdims || dim != 0 was false. Can't reduce on dim with value of 0 if 'keepdims' is false. Invalid output shape would be produced. input_shape:{0,4}
Stacktrace:
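For context, the failing op can be reproduced in isolation. This is a hypothetical minimal repro (not taken from the original report), assuming a CUDA-enabled onnxruntime build: a ReduceMax with keepdims=0 applied to an empty {0, 4} tensor, matching the input_shape in the error above.

import numpy as np
import onnxruntime as ort
from onnx import TensorProto, helper

# One-node graph: ReduceMax over axis 0 with keepdims=0, as in the
# exported FasterRCNN post-processing.
node = helper.make_node("ReduceMax", ["x"], ["y"], axes=[0], keepdims=0)
graph = helper.make_graph(
    [node], "reduce_repro",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, ["n", 4])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [4])])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 11)])

sess = ort.InferenceSession(model.SerializeToString(),
                            providers=["CUDAExecutionProvider"])
# Feeding a 0-row tensor triggers the PrepareForReduce check on the
# CUDA provider; behavior may differ on the CPU provider.
sess.run(None, {"x": np.zeros((0, 4), np.float32)})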
TRTIS Information
Running the container with tag nvcr.io/nvidia/tensorrtserver:20.02-py3
To Reproduce
The model is a torchvision.models.detection.FasterRCNN, exported as follows:
import os

import torch

# model, images, and output_dir are assumed to be defined earlier:
# model is the FasterRCNN instance, images its example input.
outputs = ["boxes", "labels", "scores"]
dynamic_axes_dict = {output_name: {0: "detections"}
                     for output_name in outputs}

torch.onnx.export(model, images, os.path.join(output_dir, "model.onnx"),
                  export_params=True,        # store weights in the model file
                  do_constant_folding=True,  # constant folding for optimization
                  opset_version=11,          # opset 11 required for maskrcnn export
                  input_names=["images"],
                  output_names=outputs,
                  dynamic_axes=dynamic_axes_dict,
                  # keep_initializers_as_inputs=True,
                  verbose=True)
The exported model's output has been validated against the output of the native PyTorch checkpoint. Everything looks good when the model is run locally in an onnxruntime session.
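For reference, the local validation was along these lines; a minimal sketch assuming model, images, and output_dir from the export snippet above, with illustrative tolerances:

import os

import numpy as np
import onnxruntime as ort
import torch

# Run the exported graph locally; the ONNX input is the single CHW
# image tensor (torchvision's forward takes a list of such tensors).
sess = ort.InferenceSession(os.path.join(output_dir, "model.onnx"))
onnx_boxes, onnx_labels, onnx_scores = sess.run(
    None, {"images": images[0].cpu().numpy()})

# Compare against the native PyTorch outputs.
model.eval()
with torch.no_grad():
    torch_out = model(images)[0]
np.testing.assert_allclose(torch_out["boxes"].cpu().numpy(), onnx_boxes,
                           rtol=1e-3, atol=1e-5)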
The TRTIS config is as follows:
name: "torch_detection_rcnn"
platform: "onnxruntime_onnx"
input [
{
name: "images"
data_type: TYPE_FP32
format: FORMAT_NCHW
dims: [ 3, 745, 1324 ]
}
]
output [
{
name: "boxes"
data_type: TYPE_FP32
dims: [ -1, 4 ]
},
{
name: "labels"
data_type: TYPE_INT64
dims: [ -1 ]
},
{
name: "scores"
data_type: TYPE_FP32
dims: [ -1 ]
}
]
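For completeness, the failing request looked roughly like the following. This sketch uses the current tritonclient package rather than the older tensorrtserver client API that shipped with the 20.02 container, and the server URL is an assumption:

import numpy as np
import tritonclient.http as httpclient

# Connect to the server's HTTP endpoint (URL is an assumption).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the single FP32 CHW input declared in the config above.
image = np.random.rand(3, 745, 1324).astype(np.float32)
infer_input = httpclient.InferInput("images", list(image.shape), "FP32")
infer_input.set_data_from_numpy(image)

# This is the request during which the server-side ReduceMax error
# surfaces (per the analysis below, when the image yields no detections).
result = client.infer("torch_detection_rcnn", inputs=[infer_input])
boxes = result.as_numpy("boxes")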
The model is loaded by TRTIS with no complaints, but the error occurs during inference request handling.
Expected behavior
Given that I'm able to validate the model and load it onto TRTIS just fine, I expected it to handle inference requests just fine as well.
I’m not sure where to go from here, so I had a few questions.
- Is there a particular opset we should be exporting with? I'm using 11 because torch necessitates it.
- Should we be exporting it with all initializers kept as inputs? I tried this and ran into a different error that was more vague.
I'm at a bit of a roadblock, so let me know if you need any more information. Also, please let me know if y'all have any suggestions for what to look into and experiment with from here.
Top GitHub Comments
Is this still not working? If so, please reopen.
Apologies for the delayed response. This issue should remain closed, but I just wanted to update the info here.
I believe the problem arises from the dynamic axes (for the number of detections). It appears that something in the exported ONNX graph doesn't account for a dynamic axis potentially being 0 (no detections): an image with zero detections yields an empty {0, 4} boxes tensor, which is exactly the input_shape the ReduceMax error above complains about.
I simply modified the graph before exporting to pad all output tensors to a fixed number of elements and was able to bypass this issue.
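The fix was along these lines; a minimal sketch of a wrapper module that pads the detection outputs to a fixed size before export (MAX_DETECTIONS, the class name, and the zero padding values are illustrative, not the exact graph edit used):

import torch
import torch.nn.functional as F

MAX_DETECTIONS = 100  # illustrative fixed size

class PaddedRCNN(torch.nn.Module):
    """Wraps a torchvision detection model so every output has a fixed first dim."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, images):
        detections = self.model(images)[0]
        # Assumes n <= MAX_DETECTIONS (torchvision already caps
        # detections per image via box_detections_per_img).
        pad = MAX_DETECTIONS - detections["boxes"].shape[0]
        # Zero-pad each output along dim 0 so no axis ever has size 0.
        boxes = F.pad(detections["boxes"], (0, 0, 0, pad))
        labels = F.pad(detections["labels"], (0, pad))
        scores = F.pad(detections["scores"], (0, pad))
        return boxes, labels, scores

With fixed-size outputs, the dynamic_axes argument can be dropped from the export and the -1 dims in the TRTIS config replaced with MAX_DETECTIONS. Note that whether the pad amount stays dynamic depends on the export path; the original fix modified the graph directly, and this wrapper is just one way to get the same effect.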