
Torchvision ops not compiled with GPU support

Description

Running a FasterRCNN model exported to TorchScript fails because torch.ops.torchvision.nms is not compiled with GPU support.
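
The underlying failure can be reproduced outside Triton with any CPU-only torchvision build; a minimal sketch (hypothetical, assuming torchvision's C++ extension was compiled without CUDA kernels):

import torch
import torchvision

# Two overlapping boxes in (x1, y1, x2, y2) format, placed on the GPU.
boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                      [1.0, 1.0, 11.0, 11.0]], device='cuda')
scores = torch.tensor([0.9, 0.8], device='cuda')

# Raises "RuntimeError: Not compiled with GPU support" when the torchvision
# extension lacks CUDA kernels; returns the indices of kept boxes otherwise.
keep = torchvision.ops.nms(boxes, scores, iou_threshold=0.5)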

Triton Information

What version of Triton are you using? nvcr.io/nvidia/tritonserver:20.07-py3

Are you using the Triton container or did you build it yourself? Container.

To Reproduce

  1. Export the model using the nvcr.io/nvidia/pytorch:20.07-py3 container:

The Faster R-CNN model is wrapped so that its inputs and outputs are compatible with Triton.

import torch
import torchvision


class FasterRCNNWrapper(torch.nn.Module):
    """Wraps the detector so it takes a single CHW image tensor and
    returns flat (boxes, labels, scores) tensors."""

    def __init__(self):
        super().__init__()
        self.model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

    def forward(self, x):
        # Under TorchScript the detection models return (losses, detections).
        losses, detections = self.model([x])
        return detections[0]['boxes'], detections[0]['labels'], detections[0]['scores']


model = FasterRCNNWrapper()
model.eval()  # export in inference mode

script = torch.jit.script(model)
script.save('model.pt')
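
Before handing model.pt to Triton, a local sanity check is useful; a sketch (assuming it runs in the same pytorch:20.07-py3 container with a GPU visible). If this succeeds, the NMS CUDA kernel exists in the local torchvision build, which points the failure at the torchvision linked into Triton's LibTorch backend rather than at the exported model:

# Load the serialized TorchScript module and run it once on the GPU.
loaded = torch.jit.load('model.pt').cuda().eval()
with torch.no_grad():
    boxes, labels, scores = loaded(torch.zeros(3, 32, 32, device='cuda'))
print(boxes.shape, labels.shape, scores.shape)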

  2. Prepare the model repository:

I'm using the following directory structure:

models/
    od/
        0/
            model.pt
        config.pbtxt

config.pbtxt

name: "od"
platform: "pytorch_libtorch"
input [
  {
    name: "inputs__0"
    data_type: TYPE_FP32
    dims: [ 3, -1, -1 ]
  }
]
output [
  {
    name: "boxes__0"
    data_type: TYPE_FP32
    dims: [ 1000, 4  ]
  },
  {
    name: "labels__1"
    data_type: TYPE_INT64
    dims: [ 1000 ]
  },
  {
    name: "scores__2"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
instance_group [
  {
    kind: KIND_GPU
  }
]
  3. Run Triton:

docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v$(pwd)/models:/models nvcr.io/nvidia/tritonserver:20.07-py3 tritonserver --model-repository=/models --log-verbose=1 --strict-model-config=false
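
Once the container is up, the gRPC client can confirm that the server and the od model are ready before sending requests; a small sketch (assuming the tritongrpcclient package from the matching client release):

import tritongrpcclient

client = tritongrpcclient.InferenceServerClient(url='localhost:8001')
print(client.is_server_ready())     # True once the model repository is loaded
print(client.is_model_ready('od'))  # True once the "od" model is available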

  4. Run inference:

import numpy as np
import sys
import tritongrpcclient

url = 'localhost:8001'
model_name = 'od'

try:
    triton_client = tritongrpcclient.InferenceServerClient(
        url=url,
        verbose=True
    )
except Exception as e:
    print("channel creation failed: " + str(e))
    sys.exit()

inputs = []
outputs = []

im = np.zeros((3, 32, 32), dtype=np.float32)

inputs.append(tritongrpcclient.InferInput('inputs__0', im.shape, "FP32"))
inputs[0].set_data_from_numpy(im)

outputs.append(tritongrpcclient.InferRequestedOutput('boxes__0'))
outputs.append(tritongrpcclient.InferRequestedOutput('labels__1'))
outputs.append(tritongrpcclient.InferRequestedOutput('scores__2'))

results = triton_client.infer(model_name=model_name,
                              inputs=inputs,
                              outputs=outputs)
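
If the call succeeded, each requested output could be read back as a NumPy array (a short sketch matching the output names in config.pbtxt):

boxes = results.as_numpy('boxes__0')
labels = results.as_numpy('labels__1')
scores = results.as_numpy('scores__2')
print(boxes.shape, labels.shape, scores.shape)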

Running inference results in the following error:

I0805 09:38:43.840524 1 libtorch_backend.cc:774] The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/torchvision/ops/boxes.py", line 50, in forward
    _18 = torch.slice(offsets, 0, 0, 9223372036854775807, 1)
    boxes_for_nms = torch.add(boxes, torch.unsqueeze(_18, 1), alpha=1)
    keep = __torch__.torchvision.ops.boxes.nms(boxes_for_nms, scores, iou_threshold, )
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _11 = keep
  return _11
  File "code/__torch__/torchvision/ops/boxes.py", line 91, in nms
    scores: Tensor,
    iou_threshold: float) -> Tensor:
  _42 = ops.torchvision.nms(boxes, scores, iou_threshold)
        ~~~~~~~~~~~~~~~~~~~ <--- HERE
  return _42

Traceback of TorchScript, original code (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torchvision/ops/boxes.py", line 41, in nms
        by NMS, sorted in decreasing order of scores
    """
    return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
           ~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
RuntimeError: Not compiled with GPU support

Expected behavior

Triton can run inference on a FasterRCNN model exported to TorchScript on the GPU.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

3 reactions
lzha106 commented on Aug 4, 2021

Verified that nvcr.io/nvidia/tritonserver:20.10-py3 works well.

0 reactions
adamm123 commented on Sep 16, 2020

I tried evaluating the model on nvcr.io/nvidia/tritonserver:20.08-py3. The error message is different, but GPU support still seems to be missing: the dispatcher reports that torchvision::nms is registered only for the CPU backend.

inference_server_1  | RuntimeError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. 'torchvision::nms' is only available for these backends: [CPU, BackendSelect, Named, Autograd, Profiler, Tracer, Autocast, Batched].
inference_server_1  | 
inference_server_1  | CPU: registered at /opt/pytorch/vision/torchvision/csrc/vision.cpp:59 [kernel]
inference_server_1  | BackendSelect: fallthrough registered at /tmp/pip-req-build-gk_ormv_/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
inference_server_1  | Named: registered at /tmp/pip-req-build-gk_ormv_/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
inference_server_1  | Autograd: fallthrough registered at /tmp/pip-req-build-gk_ormv_/aten/src/ATen/core/VariableFallbackKernel.cpp:31 [backend fallback]
inference_server_1  | Profiler: registered at /tmp/pip-req-build-gk_ormv_/torch/csrc/autograd/profiler.cpp:677 [backend fallback]
inference_server_1  | Tracer: fallthrough registered at /tmp/pip-req-build-gk_ormv_/torch/csrc/jit/frontend/tracer.cpp:993 [backend fallback]
inference_server_1  | Autocast: fallthrough registered at /tmp/pip-req-build-gk_ormv_/aten/src/ATen/autocast_mode.cpp:375 [backend fallback]
inference_server_1  | Batched: registered at /tmp/pip-req-build-gk_ormv_/aten/src/ATen/BatchingRegistrations.cpp:149 [backend fallback]
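
Both errors suggest the torchvision inside the Triton image was built without CUDA kernels. A quick way to check what a given build ships (a sketch; run in a Python shell with the torch/torchvision pair in question, e.g. inside the pytorch container; _cuda_version is an internal op and may not exist in every release):

import torch
import torchvision

print(torch.version.cuda)        # CUDA toolkit PyTorch was built against
print(torchvision.version.cuda)  # None for CPU-only torchvision builds
# The C++ extension reports the same information; -1 means no CUDA support.
print(torch.ops.torchvision._cuda_version())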

