Torch-TensorRT Integration
🚀 The feature
pytorch-tensorrt hit release 1.0 (it is actually at 1.1 right now), but most of the available models are not convertible to it out of the box.
Motivation, pitch
See the feature description above.
Alternatives
The alternative would be to convert to ONNX and then convert to TensorRT, which is exactly the detour torch-trt tries to avoid. It would also require more work, because somebody would have to make sure the models are both ONNX-compatible and torch-trt-compatible, with the latter being a TorchScript-supporting runtime.
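For context, a minimal sketch of that ONNX detour, assuming an FCOS model and illustrative shapes (exporting detection models to ONNX may itself need extra work, which is exactly the maintenance burden described above):

import torch
from torchvision.models.detection import fcos_resnet50_fpn

model = fcos_resnet50_fpn().eval()
dummy = torch.randn(1, 3, 1080, 1920)
# Step 1: export to ONNX (assumes the model is ONNX-exportable at this opset).
torch.onnx.export(model, dummy, "fcos.onnx", opset_version=11)
# Step 2: build a TensorRT engine from the ONNX file, e.g. with trtexec:
#   trtexec --onnx=fcos.onnx --saveEngine=fcos.trt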
Additional context
AFAIK, from some quick tests like:
import torch
import torch_tensorrt as torch_trt
from torchvision.models.detection import fcos_resnet50_fpn

module = fcos_resnet50_fpn().eval()
# The transform's max_size bounds the largest image the model can receive.
generalized_rcnn_transform_max_size = module.transform.max_size
inputs = torch_trt.Input(
    min_shape=[1, 3, 224, 224],
    opt_shape=[1, 3, 1080, 1920],
    max_shape=[
        1,
        3,
        generalized_rcnn_transform_max_size,
        generalized_rcnn_transform_max_size,
    ],
)
precisions = {torch.half}  # doesn't really matter
trt_module = torch_trt.compile(
    module=module, inputs=[inputs], enabled_precisions=precisions
)
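If compilation succeeded, the compiled module could be used like any other; a minimal usage sketch, assuming a CUDA device and an illustrative input shape:

x = torch.randn(1, 3, 1080, 1920, device="cuda")
with torch.no_grad():
    detections = trt_module(x)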
And looking into some blocking issues from TensorRT: … the first thing to do would be to remove self/state mutations (self.a = b) in the forward method, and… almost everything would work out of the box?
For reference, in a quick attempt to port an FCOS model, I found two places where this happens. I am not sure if this is strictly required.
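To make the pattern concrete, here is a hedged sketch of the kind of in-forward state mutation meant above (the module is hypothetical; the real torchvision code differs in detail):

import warnings
import torch
from torch import nn

class Before(nn.Module):
    def __init__(self):
        super().__init__()
        self._has_warned = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self._has_warned:
            warnings.warn("some one-time warning")
            self._has_warned = True  # self.a = b mutation inside forward
        return x

class After(nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Side-effect free: no attribute writes in forward; any one-time
        # warning would have to be emitted elsewhere (e.g. at construction).
        return x

A mutation-free forward keeps the module a pure function of its inputs and parameters, which is the form lowering toolchains like Torch-TensorRT can reason about.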
@ntakouris Thanks a lot for the feedback.
I would be happy if we can make the models TensorRT-compatible, as long as it doesn't involve adding extra hacks and workarounds. An example of such a workaround is at https://github.com/pytorch/vision/pull/6120
TorchVision's codebase often has to apply numerous workarounds (most of them nasty) to ensure our models are JIT-scriptable and FX-traceable. Add ONNX to the mix (for a limited set of models) and you get a codebase that is very hard to maintain. Often these workarounds conflict with one another, requiring even more hacks to make things work. So as long as we don't introduce more of them, I'm happy to investigate making the models compatible with you. Note though that I have extremely limited knowledge of TensorRT, so if we were to make things compatible, that effort would need to be driven by someone who knows it well. In addition, if we go down that path, it should be understood that the chance of not merging the PR is high due to the aforementioned caveats. If a contributor is comfortable with this, we could arrange to provide some support on the reviewing side.
Concerning the points on the mutations, I think they are valid. To the best of my knowledge, there was an alternative API we could have used to warn once, but it didn't work properly, so we had to do this workaround. If there is a better solution, we can consider adopting it. Concerning the anchor utils, the situation is more complex because unfortunately the method set_cell_anchors() is public, so removing it would be a BC-breaking change. Thankfully the whole Detection area is in Beta, which gives us some flexibility. So if we had a PR that rewrites the anchor utils to avoid side-effects (which I agree are a bad practice), we can discuss finding ways to merge the changes. Perhaps we should take the step to deprecate now so that we can change it in the next release. cc @NicolasHug to give his input here.

Not much to add to @datumbox's answer, with which I fully agree. My main concern is maintenance cost: supporting the cross-product of TorchScript, FX and ONNX already induces a very high maintenance cost for us, and adding support for TensorRT would complicate this even further. This is something we might need to discuss a bit more, so we can be certain that the benefits are worth the additional cost.
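For illustration only, a hedged sketch of a side-effect-free alternative to storing cell anchors on the module (the function name and signature are hypothetical, not the actual AnchorGenerator API):

import torch

def cell_anchors(scales, aspect_ratios, dtype=torch.float32, device="cpu"):
    # Pure function: returns the anchors instead of assigning them to
    # self.cell_anchors the way set_cell_anchors() does.
    scales = torch.as_tensor(scales, dtype=dtype, device=device)
    aspect_ratios = torch.as_tensor(aspect_ratios, dtype=dtype, device=device)
    h_ratios = torch.sqrt(aspect_ratios)
    w_ratios = 1.0 / h_ratios
    ws = (w_ratios[:, None] * scales[None, :]).reshape(-1)
    hs = (h_ratios[:, None] * scales[None, :]).reshape(-1)
    # (x1, y1, x2, y2) boxes centred at the origin
    return (torch.stack([-ws, -hs, ws, hs], dim=1) / 2).round()

anchors = cell_anchors(scales=(32, 64, 128), aspect_ratios=(0.5, 1.0, 2.0))

Because nothing is written to module state, the call is a pure function of its arguments, which sidesteps the in-forward mutation problem discussed above.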