Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Deploy Detectron2 Mask R-CNN inside Triton

See original GitHub issue

Description

  • With Detectron2, I have trained a Mask R-CNN model based on the following architecture: link to yaml file.

  • I converted my model to TorchScript format using the script provided by the Detectron2 team (link to script), so it is now in .pt format.

  • I prepared the config.pbtxt file, created a model repository as described in your documentation, and put the config and trained model there.

    • Structure of the model repo
    models_torchscript
    └─ mask_rcnn
       ├─ config.pbtxt
       └─ 1
          └─ model.pt
    
    • Content of config.pbtxt
    name: "mask_rcnn"
    platform: "pytorch_libtorch"
    max_batch_size: 0
    input [
        {
                name: "INPUT__0"
                data_type: TYPE_FP32
                dims: [1, 3, 800, 800]
        },
        {
                name: "INPUT__1"
                data_type: TYPE_FP32
                dims: [1, 1, 3]
        }
    ]
    output [
        {
                name: "OUTPUT__0"
                data_type: TYPE_FP32
                dims: [16]
        },
        {
                name: "OUTPUT__1"
                data_type: TYPE_FP32
                dims: [16]
        }
    ]
    
  • I deploy the model inside the server with the following command:

    docker run \
    --gpus=1 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    -p8000:8000 -p8001:8001 -p8002:8002 \
    -v $PWD/models_torchscript:/models \
    nvcr.io/nvidia/tritonserver:20.08-py3 tritonserver \
    --model-repository=/models \
    --strict-model-config=false \
    --log-verbose=1
    
  • The server starts with no errors.

  • When I try to run inference using your client library (sorry for the code, it is a really quick-and-dirty script):

    import numpy as np
    import tritonclient.http as httpclient

    # Dummy NCHW image plus a (height, width, scale) image-info tensor
    image = np.zeros((1, 3, 800, 800)).astype(np.float32)
    im_info = np.float32((800, 800, 1))
    im_info = np.reshape(im_info, (1, -1))
    im_info = np.expand_dims(im_info, axis=0)

    dtype = "FP32"

    input_1 = httpclient.InferInput("INPUT__0", image.shape, dtype)
    input_1.set_data_from_numpy(image, binary_data=False)

    input_2 = httpclient.InferInput("INPUT__1", im_info.shape, dtype)
    input_2.set_data_from_numpy(im_info, binary_data=False)

    inputs = [input_1, input_2]

    output_1 = httpclient.InferRequestedOutput("OUTPUT__0", binary_data=False, class_count=1)
    output_2 = httpclient.InferRequestedOutput("OUTPUT__1", binary_data=False, class_count=1)

    outputs = [output_1, output_2]

    # triton_client and FLAGS are created elsewhere in the script
    response = triton_client.infer(FLAGS.model_name, inputs, request_id="loool", model_version=FLAGS.model_version, outputs=outputs)
  • I get the following error:
I0915 18:11:15.486762 1 libtorch_backend.cc:552] Running mask_rcnn_0_gpu0 with 1 requests
I0915 18:11:15.486818 1 pinned_memory_manager.cc:130] pinned memory allocation: size 7680000, addr 0x7fcfa8000090
I0915 18:11:15.488582 1 pinned_memory_manager.cc:130] pinned memory allocation: size 12, addr 0x7fcfa87530a0
I0915 18:11:15.490377 1 libtorch_backend.cc:776] Expected at most 2 argument(s) for operator 'forward', but received 3 argument(s). Declaration: forward(__torch__.detectron2.export.caffe2_modeling.___torch_mangle_857.Caffe2GeneralizedRCNN self, (Tensor, Tensor) argument_1) -> ((Tensor, Tensor, Tensor, Tensor))
Exception raised from checkAndNormalizeInputs at /tmp/pip-req-build-gk_ormv_/aten/src/ATen/core/function_schema_inl.h:245 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7fd0447df94b in /opt/tritonserver/lib/pytorch/libc10.so)
frame #1: <unknown function> + 0x82a067 (0x7fd0c19a7067 in /opt/tritonserver/lib/pytorch/libtorch_cpu.so)
frame #2: torch::jit::GraphFunction::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, c10::IValue, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, c10::IValue> > > const&) + 0x2d (0x7fd0c3a91cad in /opt/tritonserver/lib/pytorch/libtorch_cpu.so)
frame #3: torch::jit::Method::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, c10::IValue, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, c10::IValue> > > const&) + 0x109 (0x7fd0c3aa19d9 in /opt/tritonserver/lib/pytorch/libtorch_cpu.so)
frame #4: <unknown function> + 0x27fc87 (0x7fd0e3a7cc87 in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #5: <unknown function> + 0x286e4d (0x7fd0e3a83e4d in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #6: <unknown function> + 0x98000 (0x7fd0e3895000 in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #7: <unknown function> + 0xafaf7 (0x7fd0e38acaf7 in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #8: <unknown function> + 0xbd6df (0x7fd0e27996df in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #9: <unknown function> + 0x76db (0x7fd0e35e56db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #10: clone + 0x3f (0x7fd0e1e56a3f in /lib/x86_64-linux-gnu/libc.so.6)
  • I have no idea how to solve this issue. Could anybody help me out?

Triton Information

What version of Triton are you using? 20.08

Are you using the Triton container or did you build it yourself? I’m using the Triton container: nvcr.io/nvidia/tritonserver:20.08-py3

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 25 (5 by maintainers)

Top GitHub Comments

16 reactions
SkalskiP commented, Sep 23, 2020

Letter to the people from the future

Below you will find an approximate path to deploy Detectron2 Mask R-CNN inside the Triton Inference Server:

  • Train your Detectron2 Mask R-CNN model in Python

  • Convert the Detectron2 model to TorchScript using this script

  • The original Detectron2 model requires only an image tensor to run inference. However, the TorchScript model requires an additional tensor with information about the image dimensions. Those two arguments need to form a tuple - Tuple[Tensor, Tensor]. According to the Detectron2 documentation:

    All converted models (the .pb files) take two input tensors: “data” is an NCHW image, and “im_info” is an Nx3 tensor consisting of (height, width, 1.0) for each image (the shape of “data” might be larger than that in “im_info” due to padding).

    The problem, however, is that Triton does not allow passing a Tuple as an argument to a neural network's forward pass. As a workaround, we can wrap the model in another dummy model that accepts two separate Tensor arguments and builds the Tuple inside its forward method. (A quick local sanity check of the wrapped model is sketched just after this list.)

    import torch

    class Wrapper(torch.nn.Module):
        def __init__(self):
            super(Wrapper, self).__init__()
            # Load the tuple-input TorchScript model exported by Detectron2
            self.model = torch.jit.load(SOURCE_MODEL_PATH).to(device)

        def forward(self, x: torch.Tensor, y: torch.Tensor):
            # Rebuild the (image, im_info) tuple the inner model expects
            return self.model.forward((x, y))

    m = torch.jit.script(Wrapper())
    m.save(TARGET_MODEL_PATH)
    
  • Create a model repository on the host machine, as described in the documentation. Put your output model.pt file in the correct place in that folder structure.

    models_torchscript
    └─ mask_rcnn
       ├─ config.pbtxt
       ├─ 1
       │  └─ model.pt
       └─ 2
          └─ model.pt
    
  • Create a config.pbtxt file with the content below (model configuration documentation):

    name: "mask_rcnn"
    platform: "pytorch_libtorch"
    max_batch_size: 0
    input [
        {
        	  name: "INPUT__0"
        	  data_type: TYPE_FP32
        	  dims: [1, 3, 800, 800]
        },
        {
        	  name: "INPUT__1"
        	  data_type: TYPE_FP32
        	  dims: [1, 3]
        }
    ]
    output [
        {
        	  name: "OUTPUT__0"
        	  data_type: TYPE_FP32
        	  dims: [-1,4]
        },
        {
        	  name: "OUTPUT__1"
        	  data_type: TYPE_FP32
        	  dims: [-1]
        },
        {
        	  name: "OUTPUT__2"
        	  data_type: TYPE_FP32
        	  dims: [-1]
        },
        {
        	  name: "OUTPUT__3"
        	  data_type: TYPE_FP32
        	  dims: [-1,1,28,28]
        }
    ]
    
  • Run the server (a client-side sketch for querying the deployed model follows this list):

    docker run --gpus=1 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    -p8000:8000 -p8001:8001 -p8002:8002 \
    -v $PWD/models_torchscript/:/models \
    nvcr.io/nvidia/tritonserver:20.08-py3 tritonserver \
    --model-repository=/models \
    --strict-model-config=false \
    --log-verbose=1
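
As a quick local sanity check (not from the original thread), the wrapped model can be exercised with dummy tensors shaped like the Triton inputs before deploying. The file name and CPU mapping below are assumptions:

    import torch

    # Load the wrapped model saved above (path and map_location are assumptions)
    m = torch.jit.load("model.pt", map_location="cpu")

    # Dummy inputs matching config.pbtxt: an NCHW image and an Nx3
    # (height, width, scale) im_info tensor
    image = torch.zeros(1, 3, 800, 800)
    im_info = torch.tensor([[800.0, 800.0, 1.0]])

    # The wrapper should return the four tensors declared as outputs
    outputs = m(image, im_info)
    for i, out in enumerate(outputs):
        print(f"OUTPUT__{i}: shape {tuple(out.shape)}")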
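
Once the server is up, a minimal client call against this configuration might look like the sketch below. This is an illustrative example, not code from the thread; the output semantics (boxes, scores/classes, 28x28 mask probabilities) should be verified against your own model:

    import numpy as np
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url="localhost:8000")

    # Inputs matching config.pbtxt: an NCHW image and an Nx3 im_info tensor
    image = np.zeros((1, 3, 800, 800), dtype=np.float32)
    im_info = np.array([[800.0, 800.0, 1.0]], dtype=np.float32)

    inputs = [
        httpclient.InferInput("INPUT__0", image.shape, "FP32"),
        httpclient.InferInput("INPUT__1", im_info.shape, "FP32"),
    ]
    inputs[0].set_data_from_numpy(image)
    inputs[1].set_data_from_numpy(im_info)

    outputs = [httpclient.InferRequestedOutput(f"OUTPUT__{i}") for i in range(4)]

    response = client.infer("mask_rcnn", inputs, outputs=outputs)
    boxes = response.as_numpy("OUTPUT__0")  # presumably [-1, 4] detection boxes
    masks = response.as_numpy("OUTPUT__3")  # presumably [-1, 1, 28, 28] masks
    print(boxes.shape, masks.shape)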
    
1 reaction
CoderHam commented, Sep 22, 2020

Currently, Triton Server does not support such complex input structures.

The LibTorch (PyTorch) backend operates under the assumption that the model's inputs are tensors, not a tuple of tensors. I’d recommend building a wrapper around your model and tracing it to produce a version of your model whose inputs are tensors instead of a tuple of tensors (i.e., pass a 4D tensor and convert it into a tuple of 3D tensors inside the model before passing it to Detectron2). A rough sketch of this idea follows below. PS: The above workaround worked for someone with Mask R-CNN.
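
A sketch of this wrapper-and-trace idea (illustrative names and paths, not from the thread, and using the two-tensor variant rather than a single stacked 4D tensor):

    import torch

    # Hypothetical wrapper: expose plain tensor inputs and rebuild the tuple
    # that the exported Detectron2 model expects
    class TraceWrapper(torch.nn.Module):
        def __init__(self, inner):
            super().__init__()
            self.inner = inner  # the tuple-input TorchScript model

        def forward(self, image: torch.Tensor, im_info: torch.Tensor):
            return self.inner((image, im_info))

    inner = torch.jit.load("source_model.pt")  # assumed path of the exported model
    wrapper = TraceWrapper(inner).eval()

    # Trace with example inputs so the saved forward() takes tensors only;
    # tracing may warn about control flow inside the scripted inner model
    example_image = torch.zeros(1, 3, 800, 800)
    example_info = torch.tensor([[800.0, 800.0, 1.0]])
    traced = torch.jit.trace(wrapper, (example_image, example_info))
    traced.save("model.pt")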

Closing this issue. Please re-open if the above workaround does not solve your problem.

Top Results From Across the Web

Deployment — detectron2 0.6 documentation
Models written in Python need to go through an export process to become a deployable artifact. ... C++ examples for Mask R-CNN are...

Convert Detectron2 model to TensorRT - Medium
This post covers the steps needed to convert a Detectron2 (MaskRCNN) model to TensorRT format and deploy it on Triton Inference Server.

[P] Deploy Detectron2 models with Triton inference server
I share in the post below how I deployed Detectron2 models with Triton inference server (an inference system developed by NVIDIA).

Train MaskRCNN on custom dataset with Detectron2 in 4 steps
In this tutorial, I explain step-by-step training MaskRCNN on a custom dataset using Detectron2, so you can see how easy it is in...

Developing and Deploying Your Custom Action Recognition ...
In this post, we show how you can fast-track your AI application development by taking a pretrained action recognition model, fine-tuning it ...
