Torchvision ops not compiled with GPU support
Description
Running a Faster R-CNN model exported to TorchScript fails because torch.ops.torchvision.nms is not compiled with GPU support.
Triton Information
What version of Triton are you using?
nvcr.io/nvidia/tritonserver:20.07-py3
Are you using the Triton container or did you build it yourself?
container
To Reproduce
- Export the model using the nvcr.io/nvidia/pytorch:20.07-py3 container:
The Faster R-CNN model is wrapped so its inputs and outputs are compatible with Triton.
```python
import torch
import torchvision


class FasterRCNNWrapper(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

    def forward(self, x):
        losses, detections = self.model([x])
        return detections[0]['boxes'], detections[0]['labels'], detections[0]['scores']


model = FasterRCNNWrapper()
script = torch.jit.script(model)
script.save('model.pt')
```
- Prepare the model repository. I'm using the following directory structure:

```
models/
└── od/
    ├── 0/
    │   └── model.pt
    └── config.pbtxt
```

config.pbtxt:
```
name: "od"
platform: "pytorch_libtorch"
input [
  {
    name: "inputs__0"
    data_type: TYPE_FP32
    dims: [ 3, -1, -1 ]
  }
]
output [
  {
    name: "boxes__0"
    data_type: TYPE_FP32
    dims: [ 1000, 4 ]
  },
  {
    name: "labels__1"
    data_type: TYPE_INT64
    dims: [ 1000 ]
  },
  {
    name: "scores__2"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
instance_group [
  {
    kind: KIND_GPU
  }
]
```
- Run Triton:

```shell
docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 \
  -v $(pwd)/models:/models nvcr.io/nvidia/tritonserver:20.07-py3 \
  tritonserver --model-repository=/models --log-verbose=1 --strict-model-config=false
```
- Run inference:
```python
import numpy as np
import sys

import tritongrpcclient

url = 'localhost:8001'
model_name = 'od'

try:
    triton_client = tritongrpcclient.InferenceServerClient(
        url=url,
        verbose=True
    )
except Exception as e:
    print("channel creation failed: " + str(e))
    sys.exit()

inputs = []
outputs = []
im = np.zeros((3, 32, 32), dtype=np.float32)
inputs.append(tritongrpcclient.InferInput('inputs__0', im.shape, "FP32"))
inputs[0].set_data_from_numpy(im)
outputs.append(tritongrpcclient.InferRequestedOutput('boxes__0'))
outputs.append(tritongrpcclient.InferRequestedOutput('labels__1'))
outputs.append(tritongrpcclient.InferRequestedOutput('scores__2'))
results = triton_client.infer(model_name=model_name,
                              inputs=inputs,
                              outputs=outputs)
```
Running inference results in the following error:
```
I0805 09:38:43.840524 1 libtorch_backend.cc:774] The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/torchvision/ops/boxes.py", line 50, in forward
    _18 = torch.slice(offsets, 0, 0, 9223372036854775807, 1)
    boxes_for_nms = torch.add(boxes, torch.unsqueeze(_18, 1), alpha=1)
    keep = __torch__.torchvision.ops.boxes.nms(boxes_for_nms, scores, iou_threshold, )
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _11 = keep
    return _11
  File "code/__torch__/torchvision/ops/boxes.py", line 91, in nms
    scores: Tensor,
    iou_threshold: float) -> Tensor:
    _42 = ops.torchvision.nms(boxes, scores, iou_threshold)
          ~~~~~~~~~~~~~~~~~~~ <--- HERE
    return _42
Traceback of TorchScript, original code (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torchvision/ops/boxes.py", line 41, in nms
    by NMS, sorted in decreasing order of scores
    """
    return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
           ~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
RuntimeError: Not compiled with GPU support
```
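For reference, the op the interpreter fails to dispatch is plain greedy non-maximum suppression. The following NumPy sketch illustrates what torch.ops.torchvision.nms computes; the real op is a compiled CPU/CUDA kernel, so this is not a drop-in replacement:

```python
import numpy as np

def nms(boxes, scores, iou_threshold):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes.

    Illustrative sketch of what torch.ops.torchvision.nms computes.
    """
    order = np.argsort(scores)[::-1]  # process highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of box i with each remaining box
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        # Keep only boxes that do not overlap box i too much
        order = rest[iou <= iou_threshold]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
print(nms(boxes, scores, 0.5))  # [0, 2]: box 1 overlaps box 0 with IoU ~0.68
```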
Expected behavior
Triton can run inference on a FasterRCNN model exported to TorchScript on GPU.
Issue Analytics
- Created 3 years ago
- Comments:9 (4 by maintainers)
Top GitHub Comments
Verified nvcr.io/nvidia/tritonserver:20.10-py3 works well.
I tried evaluating the model on nvcr.io/nvidia/tritonserver:20.08-py3. The error message is different, but it still seems to be missing GPU support.