YOLOv5s TorchScript model shows PyTorch backend bug?
Bug: This bug shows up with an exported YOLOv5s traced TorchScript model on Triton Inference Server.
Environment
- OS: Ubuntu 20.04
- GPU: RTX 3090
To Reproduce: I first export the yolov5s model to TorchScript with batch size 8 and image size 320 using their models/export.py script (an invocation sketch follows the config below). Then I serve this model on a Triton Inference Server Docker container with the nvcr.io/nvidia/tritonserver:20.12-py3 image and the following config.pbtxt:
name: "model"
platform: "pytorch_libtorch"
max_batch_size: 8
input {
name: "input__0"
data_type: TYPE_FP32
format: FORMAT_NCHW
dims: [3,-1,-1]
}
output [
{name: "output__0"
data_type: TYPE_FP32
dims: [3,-1,-1,-1]
},
{name: "output__1"
data_type: TYPE_FP32
dims: [3,-1,-1,-1]
},
{name: "output__2"
data_type: TYPE_FP32
dims: [3,-1,-1,-1]
}
]
instance_group [
{
count: 1
kind: KIND_GPU
}
]
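For reference, the export step looked roughly like this (a sketch; the flag names are assumed from that era's models/export.py and may differ):
python models/export.py --weights yolov5s.pt --img-size 320 --batch-size 8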
Triton Information: Docker image nvcr.io/nvidia/tritonserver:20.12-py3
Description: When I run inference against this model, it fails with the following output:
InferenceServerException: PyTorch execute failure: isTensor() INTERNAL ASSERT FAILED at "/opt/tritonserver/include/torch/ATen/core/ivalue_inl.h":137, please report a bug to PyTorch. Expected Tensor but got GenericList
Exception raised from toTensor at /opt/tritonserver/include/torch/ATen/core/ivalue_inl.h:137 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6c (0x7f1c9112c6cc in /opt/tritonserver/backends/pytorch/libc10.so)
frame #1: <unknown function> + 0x29346 (0x7f1c91678346 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #2: <unknown function> + 0x1320d (0x7f1c9166220d in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #3: <unknown function> + 0x18ee3 (0x7f1c91667ee3 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #4: TRITONBACKEND_ModelInstanceExecute + 0x387 (0x7f1c91669247 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #5: <unknown function> + 0x2da9b7 (0x7f1ce2ab79b7 in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #6: <unknown function> + 0xf1240 (0x7f1ce28ce240 in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #7: <unknown function> + 0xd6d84 (0x7f1ce2317d84 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #8: <unknown function> + 0x9609 (0x7f1ce27b2609 in /usr/lib/x86_64-linux-gnu/libpthread.so.0)
frame #9: clone + 0x43 (0x7f1ce2005293 in /usr/lib/x86_64-linux-gnu/libc.so.6)
Expected behavior: I tested the TorchScript model in plain Python, where it returns outputs with the following shapes:
torch.Size([1, 3, 24, 40, 6])
torch.Size([1, 3, 12, 20, 6])
torch.Size([1, 3, 6, 10, 6])
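These shapes come from a quick local check along the following lines (a minimal sketch; the file name is an assumption, and the 1x3x192x320 input is inferred from the 24x40 / 12x20 / 6x10 grids above):

import torch

model = torch.jit.load("yolov5s.torchscript.pt")  # exported file; name assumed
model.eval()
dummy = torch.zeros(1, 3, 192, 320)  # H=192, W=320 reproduces the grids above
with torch.no_grad():
    pred, feature_maps = model(dummy)  # eval() forward returns (torch.cat(z, 1), x)
for fm in feature_maps:
    print(fm.shape)  # torch.Size([1, 3, 24, 40, 6]) etc.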
However, when I run inference with the Triton client (httpclient.InferenceServerClient), it raises the error above.
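For completeness, the failing call goes through the standard Triton HTTP client, roughly like this (a minimal sketch; the URL and model name are assumptions):

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
# one 3x192x320 image for the dynamic [3, -1, -1] input
img = np.zeros((1, 3, 192, 320), dtype=np.float32)
inp = httpclient.InferInput("input__0", list(img.shape), "FP32")
inp.set_data_from_numpy(img)
outputs = [httpclient.InferRequestedOutput(f"output__{i}") for i in range(3)]
result = client.infer("model", [inp], outputs=outputs)  # raises the GenericList error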
Top GitHub Comments
Since the LibTorch backend only supports passing each input/output as a single Tensor, not a List of Tensors or a GenericList, you must write wrapper code around the YOLOv5 model so that the input and output of the traced model preserve this assumption. This should be straightforward.
The PyTorch community uses a list of tensors instead of a single tensor as I/O for many detection models, and this is a common issue we have seen. However, due to the lack of metadata available from the TorchScript model, Triton must operate under the aforementioned assumption.
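A minimal sketch of such a wrapper (assuming the YOLOv5 eval() forward returns (torch.cat(z, 1), x) as in the original code); you would trace this wrapper instead of the bare model:

import torch
import torch.nn as nn

class TritonWrapper(nn.Module):
    # Keeps only the concatenated detection tensor, so every traced
    # output is a plain Tensor rather than a GenericList.
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        pred, _ = self.model(x)  # drop the list of per-scale feature maps
        return pred

# e.g.: torch.jit.trace(TritonWrapper(yolov5s).eval(), torch.zeros(8, 3, 320, 320))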
@luvwinnie hi, I solved the problem by modifying the model's forward. Here, in yolov5, change
return x if self.training else (torch.cat(z, 1), x)
to
return x if self.training else torch.cat(z, 1)
then export again. You can also change the batch size to 1 so that Triton can send the input data and receive the single tensor.
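With that change the traced model exposes a single tensor output, so the config.pbtxt above would shrink to something like this (the dims values are an assumption; [ -1, 6 ] matches the concatenated [batch, boxes, 6] prediction shape):

name: "model"
platform: "pytorch_libtorch"
max_batch_size: 1
input {
  name: "input__0"
  data_type: TYPE_FP32
  format: FORMAT_NCHW
  dims: [ 3, -1, -1 ]
}
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ -1, 6 ]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]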