
How to deploy a Detectron2 model using PyTorch?

See original GitHub issue

I exported a Detectron2 model to TorchScript and am trying to deploy it on Triton Inference Server. I use the detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl weights.
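(For context, a minimal sketch of what such an export step might look like. The Wrapper class is an assumption, since the mymodel.py from the traceback below is not shown in the issue; detectron2's dict-in/dict-out interface has to be adapted to plain tensors before torch.jit.trace can be applied:)

# Minimal export sketch; the Wrapper below is a hypothetical stand-in for
# the issue's mymodel.py, which is not shown.
import torch
from detectron2 import model_zoo
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg
from detectron2.modeling import build_model

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"
model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)
model.eval()

class Wrapper(torch.nn.Module):
    # Adapts detectron2's dict-based interface to plain tensors for tracing.
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, image):
        # image: (H, W, 3) float32, matching the "conv2d__0" input below
        inputs = [{"image": image.permute(2, 0, 1)}]
        out = self.model.inference(inputs, do_postprocess=False)[0]
        return (out.pred_boxes.tensor, out.scores,
                out.pred_classes, out.pred_masks)

dummy = torch.rand(1056, 1920, 3)
with torch.no_grad():
    ts = torch.jit.trace(Wrapper(model), (dummy,))
ts.save("model.pt")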

My config.pbtxt is below:

name: "testmodel"
platform: "pytorch_libtorch"
max_batch_size: 0
input [
  {
    name: "conv2d__0"
    data_type: TYPE_FP32
    dims: [1056, 1920, 3]
  }
]
output [
  {
    name: "bboxex__0"
    data_type: TYPE_FP32
    dims: [-1, 4]
  },
  {
    name: "scores__1"
    data_type: TYPE_FP32
    dims: [-1]
  },
  {
    name: "classes__2"
    data_type: TYPE_INT32
    dims: [-1]
  },
  {
    name: "masks__3"
    data_type: TYPE_BOOL
    dims: [-1]
  }
]
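
(For reference, a request against this config would look roughly like the following. This is a sketch using Triton's standard Python client, tritonclient; the server URL and the random stand-in image are assumptions:)

# Hypothetical client call matching the config above.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
image = np.random.rand(1056, 1920, 3).astype(np.float32)  # stand-in input

inp = httpclient.InferInput("conv2d__0", list(image.shape), "FP32")
inp.set_data_from_numpy(image)

result = client.infer("testmodel", inputs=[inp])
boxes = result.as_numpy("bboxex__0")   # output names must match config.pbtxt
scores = result.as_numpy("scores__1")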

When I run inference against the model, I receive this error:

File "code/__torch__/detectron2/modeling/meta_arch/rcnn.py", line 348, in forward
    _9 = torch.slice(max_size, 0, -2, 9223372036854775807, 1)
    _10 = torch.add(_9, CONSTANTS.c2, alpha=1)
    _11 = torch.floor_divide(_10, CONSTANTS.c3)
          ~~~~~~~~~~~~~~~~~~ <--- HERE
    max_size0 = torch.cat([_8, torch.mul(_11, CONSTANTS.c3)], 0)
    h = ops.prim.NumToTensor(torch.size(t, 1))
Traceback of TorchScript, original code (most recent call last):
/usr/local/lib/python3.6/dist-packages/torch/tensor.py(424): __floordiv__
/usr/local/lib/python3.6/dist-packages/torch/tensor.py(22): wrapped
/usr/local/lib/python3.6/dist-packages/detectron2/structures/image_list.py(98): from_tensors
/usr/local/lib/python3.6/dist-packages/detectron2/modeling/meta_arch/rcnn.py(222): preprocess_image
/usr/local/lib/python3.6/dist-packages/detectron2/modeling/meta_arch/rcnn.py(196): inference
/usr/local/lib/python3.6/dist-packages/detectron2/modeling/meta_arch/rcnn.py(149): forward
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py(704): _slow_forward
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py(720): _call_impl
/root/detectron2/tools/deploy/mymodel.py(70): forward
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py(704): _slow_forward
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py(720): _call_impl
/usr/local/lib/python3.6/dist-packages/torch/jit/__init__.py(1109): trace_module
/usr/local/lib/python3.6/dist-packages/torch/jit/__init__.py(955): trace
<stdin>(1): 
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Before this error I hit the same device-mismatch error several times in other places, and I worked around each one by moving CPU operations to the GPU with .cuda().

How should I solve this problem? Or is there a way to deploy the Detectron2 model?

Thank you for reading.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 24 (7 by maintainers)

Top GitHub Comments

10 reactions
CescMessi commented, Nov 19, 2021

I had the same problem and found a solution. The problem is indeed caused by the tensor max_size, which lives on the CPU. When I modified the detectron2 code to move that variable to the GPU, Triton worked fine. The code is here; I just added .to('cuda') to make it work.

Another solution is to modify the TorchScript code directly: unzip the TorchScript model file and edit the corresponding code. For the FasterRCNN model the code is in archive/code/__torch__/detectron2/export/flatten.py; add the line max_size = torch.to(max_size, dtype=4, layout=0, device=torch.device("cuda"), pin_memory=None, non_blocking=False, copy=False, memory_format=None) right after max_size first appears, then zip the folder back up. That also works in Triton.
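
(For context, the line in question is in ImageList.from_tensors in detectron2/structures/image_list.py. A paraphrased sketch of the workaround, not the exact upstream code:)

# Paraphrased sketch of the workaround inside ImageList.from_tensors
# (detectron2/structures/image_list.py); not the exact upstream code.
max_size = torch.stack(image_sizes_tensor).max(0).values  # computed on the CPU
# The added line: move the tensor to the GPU so the floor_divide in the
# traced graph sees matching devices. Hard-coding "cuda" breaks CPU-only
# runs, so this is a workaround rather than a general fix.
max_size = max_size.to("cuda")
if size_divisibility > 1:
    stride = size_divisibility
    # round H and W up to the nearest multiple of the stride
    max_size = (max_size + (stride - 1)).div(stride, rounding_mode="floor") * stride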

I don’t know whether the problem comes from detectron2 or from Triton; although the problem was solved by modifying the detectron2 code, the original TorchScript model works fine in plain PyTorch. I hope the bug can be fixed soon. @stella-ds @CoderHam

2 reactions
CoderHam commented, Oct 9, 2020

There will be a fix for the torchvision build in the upcoming release that will allow you to run Detectron2 successfully.


