Torchvision ops not compiled with GPU support
Description
Running a Faster R-CNN model exported to TorchScript fails because torch.ops.torchvision.nms is not compiled with GPU support.
Triton Information
What version of Triton are you using?
nvcr.io/nvidia/tritonserver:20.07-py3
Are you using the Triton container or did you build it yourself?
container
To Reproduce
- Export the model using the nvcr.io/nvidia/pytorch:20.07-py3 container:
The Faster R-CNN model is wrapped so its inputs and outputs are compatible with Triton.
```python
import torch
import torchvision


class FasterRCNNWrapper(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

    def forward(self, x):
        losses, detections = self.model([x])
        return detections[0]['boxes'], detections[0]['labels'], detections[0]['scores']


model = FasterRCNNWrapper()
script = torch.jit.script(model)
script.save('model.pt')
```
- Prepare the model repository. I'm using the following directory structure:

```
models/
└── od/
    ├── 0/
    │   └── model.pt
    └── config.pbtxt
```

config.pbtxt:
```
name: "od"
platform: "pytorch_libtorch"
input [
  {
    name: "inputs__0"
    data_type: TYPE_FP32
    dims: [ 3, -1, -1 ]
  }
]
output [
  {
    name: "boxes__0"
    data_type: TYPE_FP32
    dims: [ 1000, 4 ]
  },
  {
    name: "labels__1"
    data_type: TYPE_INT64
    dims: [ 1000 ]
  },
  {
    name: "scores__2"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
instance_group [
  {
    kind: KIND_GPU
  }
]
```
- Run Triton:

```shell
docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 \
  -v $(pwd)/models:/models nvcr.io/nvidia/tritonserver:20.07-py3 \
  tritonserver --model-repository=/models --log-verbose=1 --strict-model-config=false
```
- Run inference:
```python
import numpy as np
import sys

import tritongrpcclient

url = 'localhost:8001'
model_name = 'od'

try:
    triton_client = tritongrpcclient.InferenceServerClient(
        url=url,
        verbose=True
    )
except Exception as e:
    print("channel creation failed: " + str(e))
    sys.exit()

inputs = []
outputs = []
im = np.zeros((3, 32, 32), dtype=np.float32)
inputs.append(tritongrpcclient.InferInput('inputs__0', im.shape, "FP32"))
inputs[0].set_data_from_numpy(im)
outputs.append(tritongrpcclient.InferRequestedOutput('boxes__0'))
outputs.append(tritongrpcclient.InferRequestedOutput('labels__1'))
outputs.append(tritongrpcclient.InferRequestedOutput('scores__2'))
results = triton_client.infer(model_name=model_name,
                              inputs=inputs,
                              outputs=outputs)
```
Running inference results in the following error:
```
I0805 09:38:43.840524 1 libtorch_backend.cc:774] The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/torchvision/ops/boxes.py", line 50, in forward
    _18 = torch.slice(offsets, 0, 0, 9223372036854775807, 1)
    boxes_for_nms = torch.add(boxes, torch.unsqueeze(_18, 1), alpha=1)
    keep = __torch__.torchvision.ops.boxes.nms(boxes_for_nms, scores, iou_threshold, )
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _11 = keep
    return _11
  File "code/__torch__/torchvision/ops/boxes.py", line 91, in nms
    scores: Tensor,
    iou_threshold: float) -> Tensor:
    _42 = ops.torchvision.nms(boxes, scores, iou_threshold)
          ~~~~~~~~~~~~~~~~~~~ <--- HERE
    return _42
Traceback of TorchScript, original code (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torchvision/ops/boxes.py", line 41, in nms
    by NMS, sorted in decreasing order of scores
    """
    return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
           ~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
RuntimeError: Not compiled with GPU support
```
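For reference, the op the interpreter fails to dispatch is plain greedy non-maximum suppression. The following NumPy sketch illustrates what torch.ops.torchvision.nms computes; the real op is a compiled CPU/CUDA kernel, so this is not a drop-in replacement:

```python
import numpy as np

def nms(boxes, scores, iou_threshold):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes.

    Illustrative sketch of what torch.ops.torchvision.nms computes.
    """
    order = np.argsort(scores)[::-1]  # process highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of box i with each remaining box
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        # Keep only boxes that do not overlap box i too much
        order = rest[iou <= iou_threshold]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
print(nms(boxes, scores, 0.5))  # [0, 2]: box 1 overlaps box 0 with IoU ~0.68
```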
Expected behavior
Triton can run inference on a FasterRCNN model exported to TorchScript on GPU.
Issue Analytics
- Created 3 years ago
- Comments:9 (4 by maintainers)
Top GitHub Comments
Verified nvcr.io/nvidia/tritonserver:20.10-py3 works well.
I tried evaluating the model on nvcr.io/nvidia/tritonserver:20.08-py3. The error message is different, but it still seems to be missing GPU support.