YOLOv5s TorchScript model shows PyTorch backend bug?
Bug: This bug shows up with an exported YOLOv5s traced TorchScript model on Triton Inference Server.
Environment
- OS: Ubuntu 20.04
- GPU: RTX 3090
To Reproduce: I first export the yolov5s model to TorchScript with batch size 8 and image size 320 using their models/export.py script (an invocation sketch follows the config below). Then I serve this model on a Triton Inference Server Docker container with the nvcr.io/nvidia/tritonserver:20.12-py3 image and the following config.pbtxt:
name: "model"
platform: "pytorch_libtorch"
max_batch_size: 8
input {
name: "input__0"
data_type: TYPE_FP32
format: FORMAT_NCHW
dims: [3,-1,-1]
}
output [
{name: "output__0"
data_type: TYPE_FP32
dims: [3,-1,-1,-1]
},
{name: "output__1"
data_type: TYPE_FP32
dims: [3,-1,-1,-1]
},
{name: "output__2"
data_type: TYPE_FP32
dims: [3,-1,-1,-1]
}
]
instance_group [
{
count: 1
kind: KIND_GPU
}
]
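For reference, the export step looked roughly like this (a sketch; the flag names are assumed from that era's models/export.py and may differ):
python models/export.py --weights yolov5s.pt --img-size 320 --batch-size 8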
Triton Information: Docker image nvcr.io/nvidia/tritonserver:20.12-py3
Description: When I run inference against this model, it fails with the following output:
InferenceServerException: PyTorch execute failure: isTensor() INTERNAL ASSERT FAILED at "/opt/tritonserver/include/torch/ATen/core/ivalue_inl.h":137, please report a bug to PyTorch. Expected Tensor but got GenericList
Exception raised from toTensor at /opt/tritonserver/include/torch/ATen/core/ivalue_inl.h:137 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6c (0x7f1c9112c6cc in /opt/tritonserver/backends/pytorch/libc10.so)
frame #1: <unknown function> + 0x29346 (0x7f1c91678346 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #2: <unknown function> + 0x1320d (0x7f1c9166220d in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #3: <unknown function> + 0x18ee3 (0x7f1c91667ee3 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #4: TRITONBACKEND_ModelInstanceExecute + 0x387 (0x7f1c91669247 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #5: <unknown function> + 0x2da9b7 (0x7f1ce2ab79b7 in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #6: <unknown function> + 0xf1240 (0x7f1ce28ce240 in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #7: <unknown function> + 0xd6d84 (0x7f1ce2317d84 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #8: <unknown function> + 0x9609 (0x7f1ce27b2609 in /usr/lib/x86_64-linux-gnu/libpthread.so.0)
frame #9: clone + 0x43 (0x7f1ce2005293 in /usr/lib/x86_64-linux-gnu/libc.so.6)
Expected behavior: I tested the TorchScript model in plain Python, where it returns outputs with the following shapes:
torch.Size([1, 3, 24, 40, 6])
torch.Size([1, 3, 12, 20, 6])
torch.Size([1, 3, 6, 10, 6])
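These shapes come from a quick local check along the following lines (a minimal sketch; the file name is an assumption, and the 1x3x192x320 input is inferred from the 24x40 / 12x20 / 6x10 grids above):

import torch

model = torch.jit.load("yolov5s.torchscript.pt")  # exported file; name assumed
model.eval()
dummy = torch.zeros(1, 3, 192, 320)  # H=192, W=320 reproduces the grids above
with torch.no_grad():
    pred, feature_maps = model(dummy)  # eval() forward returns (torch.cat(z, 1), x)
for fm in feature_maps:
    print(fm.shape)  # torch.Size([1, 3, 24, 40, 6]) etc.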
However, when I run inference with the Triton client (httpclient.InferenceServerClient), it raises the error above.
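For completeness, the failing call goes through the standard Triton HTTP client, roughly like this (a minimal sketch; the URL and model name are assumptions):

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
# one 3x192x320 image for the dynamic [3, -1, -1] input
img = np.zeros((1, 3, 192, 320), dtype=np.float32)
inp = httpclient.InferInput("input__0", list(img.shape), "FP32")
inp.set_data_from_numpy(img)
outputs = [httpclient.InferRequestedOutput(f"output__{i}") for i in range(3)]
result = client.infer("model", [inp], outputs=outputs)  # raises the GenericList error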
Top GitHub Comments
Since the LibTorch backend only supports passing each input/output as a single Tensor, not a List of Tensors or a GenericList, you must write wrapper code around the YOLOv5 model so that the input and output of the traced model preserve this assumption. This should be straightforward.
The PyTorch community uses a list of tensors instead of a single tensor as I/O for many detection models, and this is a common issue we have seen. However, due to the lack of metadata available from the TorchScript model, Triton must operate under the aforementioned assumption.
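A minimal sketch of such a wrapper (assuming the YOLOv5 eval() forward returns (torch.cat(z, 1), x) as in the original code); you would trace this wrapper instead of the bare model:

import torch
import torch.nn as nn

class TritonWrapper(nn.Module):
    # Keeps only the concatenated detection tensor, so every traced
    # output is a plain Tensor rather than a GenericList.
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        pred, _ = self.model(x)  # drop the list of per-scale feature maps
        return pred

# e.g.: torch.jit.trace(TritonWrapper(yolov5s).eval(), torch.zeros(8, 3, 320, 320))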
@luvwinnie hi, I solved the problem by modifying the model's forward. Here, in yolov5, change
return x if self.training else (torch.cat(z, 1), x)
to
return x if self.training else torch.cat(z, 1)
then export again. You can also change the batch size to 1 so that Triton can send the input data and receive the single tensor.
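With that change the traced model exposes a single tensor output, so the config.pbtxt above would shrink to something like this (the dims values are an assumption; [ -1, 6 ] matches the concatenated [batch, boxes, 6] prediction shape):

name: "model"
platform: "pytorch_libtorch"
max_batch_size: 1
input {
  name: "input__0"
  data_type: TYPE_FP32
  format: FORMAT_NCHW
  dims: [ 3, -1, -1 ]
}
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ -1, 6 ]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]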