Libtorch cpp interface tracing error

Instructions To Reproduce the 🐛 Bug:

what changes you made (git diff) or what code you wrote

I use the libtorch C++ interface to load the exported TorchScript model. The code is as follows:

#include <ATen/ATen.h>
#include <torch/script.h>
#include <torch/torch.h>

#include <iostream>
#include <vector>


int main() {
  torch::DeviceType device_type;
  device_type = torch::kCPU;

  torch::jit::script::Module module;
  try {
    std::cout << "Loading model" << std::endl;
    // Deserialize the ScriptModule from a file using torch::jit::load().
    module = torch::jit::load("detr_resnet50.pt");
    std::cout << "Model loaded" << std::endl;
  } catch (const torch::Error& e) {
    std::cout << "error loading the model" << std::endl;
    return -1;
  } catch (const std::exception& e) {
    std::cout << "Other error: " << e.what() << std::endl;
    return -1;
  }

  // TorchScript models require a List[IValue] as input
  std::vector<torch::jit::IValue> inputs;

  // Demonet accepts a List[Tensor] as main input
  std::vector<torch::Tensor> images;
  images.push_back(torch::rand({3, 200, 200}));
  images.push_back(torch::rand({3, 256, 275}));

  inputs.push_back(images);
  auto output = module.forward(inputs);

  std::cout << "ok" << std::endl;
  std::cout << "output" << output << std::endl;
  return 0;
}

And my CMakeLists.txt:

cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
project(test_tracing)

find_package(Torch REQUIRED)

# This is needed because some headers include Python.h
find_package(Python3 COMPONENTS Development)

add_executable(${CMAKE_PROJECT_NAME} test_tracing.cpp)
target_compile_features(test_tracing PUBLIC cxx_range_for)

target_link_libraries(
  ${CMAKE_PROJECT_NAME}
  ${TORCH_LIBRARIES}
  Python3::Python
)

# set C++14 to compile
set_property(TARGET test_tracing PROPERTY CXX_STANDARD 14)

what exact command you run:

mkdir build && cd build
cmake .. -DTorch_DIR=$TORCH_PATH/share/cmake/Torch
make
./test_tracing

what you observed (including full logs):

Loading model
Model loaded
terminate called after throwing an instance of 'c10::Error'
  what():  forward() Expected a value of type '__torch__.util.misc.NestedTensor' for argument 'samples' but instead found type 'List[Tensor]'.
Position: 1
Declaration: forward(__torch__.models.detr.DETR self, __torch__.util.misc.NestedTensor samples) -> (Dict(str, Tensor))
Exception raised from checkArg at /pytorch/aten/src/ATen/core/function_schema_inl.h:193 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f1bb76cb572 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xb6d218 (0x7f1ba7b94218 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_cpu.so)
frame #2: <unknown function> + 0xb72017 (0x7f1ba7b99017 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_cpu.so)
frame #3: torch::jit::GraphFunction::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::string, c10::IValue, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, c10::IValue> > > const&) + 0x2d (0x7f1ba9c23a0d in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_cpu.so)
frame #4: torch::jit::Method::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >, std::unordered_map<std::string, c10::IValue, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, c10::IValue> > > const&) + 0x146 (0x7f1ba9c321a6 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_cpu.so)
frame #5: torch::jit::Module::forward(std::vector<c10::IValue, std::allocator<c10::IValue> >) + 0xf9 (0x5618878b0ca5 in ./test_tracing)
frame #6: main + 0x282 (0x5618878ac14d in ./test_tracing)
frame #7: __libc_start_main + 0xe7 (0x7f1b6a4abb97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #8: _start + 0x2a (0x5618878ab60a in ./test_tracing)

Expected behavior:

The libtorch cpp interface can do inference 😃

Environment:

Provide your environment information using the following command:

Collecting environment information...
PyTorch version: 1.7.0.dev20200912
Is debug build: False
CUDA used to build PyTorch: 10.2

OS: Ubuntu 18.04.3 LTS (x86_64)
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Clang version: 11.0.0 (git@gitee.com:JoelYang/llvm-project.git b3fb40b3a3c1fb7ac094eda50762624baad37552)
CMake version: version 3.18.0

Python version: 3.6 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 10.2.89
GPU models and configuration:
GPU 0: Tesla P100-SXM2-16GB

Nvidia driver version: 384.81
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
/usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn.so.8.0.3
/usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8.0.3
/usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn_adv_train.so.8.0.3
/usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8.0.3
/usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8.0.3
/usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8.0.3
/usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn_ops_train.so.8.0.3

Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.7.0.dev20200912
[pip3] torchvision==0.8.0.dev20200914
[conda] Could not collect


Top GitHub Comments

fmassa commented, Sep 18, 2020

Hi,

To use the code in C++ without changes, you would need to wrap your input in a NestedTensor, which is what the exported TorchScript model expects.

Here are two options:

1 -

Wrap your DETR model in a dummy model that takes a List[Tensor], as follows:

from typing import List

from torch import nn, Tensor


class WrappedDETR(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    # Annotate the input so TorchScript exports it as List[Tensor]
    def forward(self, inputs: List[Tensor]):
        return self.model(inputs)

This should be the easiest way to do it I think.

2 -

Export nested_tensor_from_tensor_list as a separate TorchScript object and use it to convert your list of tensors into a NestedTensor. This adds a bit more complexity on the C++ side.
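
For context on what that conversion involves: as far as DETR's helper goes, it pads every image up to the largest height and width in the batch and records a mask marking the padded pixels. Below is a minimal pure-Python sketch of the shape arithmetic only. The function name `padded_batch_shape` is hypothetical and not part of DETR, and no tensors are created.

```python
from typing import Sequence, Tuple


def padded_batch_shape(
    shapes: Sequence[Tuple[int, int, int]],
) -> Tuple[int, int, int, int]:
    """Return (batch, channels, max_height, max_width) for (C, H, W) shapes.

    Hypothetical helper for illustration; it only mirrors the idea of
    padding each image to the largest size found in the batch.
    """
    if not shapes:
        raise ValueError("need at least one image shape")
    # Element-wise maximum over channels, height and width.
    max_c = max(s[0] for s in shapes)
    max_h = max(s[1] for s in shapes)
    max_w = max(s[2] for s in shapes)
    return (len(shapes), max_c, max_h, max_w)


# The two image shapes used in the C++ reproduction from the issue.
batch_shape = padded_batch_shape([(3, 200, 200), (3, 256, 275)])
print(batch_shape)  # (2, 3, 256, 275)
```

The padding mask would then have shape (batch, max_height, max_width), one entry per pixel position of each padded image.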

I’m not sure if there is a simpler way to construct the NestedTensor object from C++, but I believe those two options should work.

For more context on the existence of NestedTensor, I suggest you check https://github.com/facebookresearch/detr/issues/116. We would like to keep the NestedTensor abstractions in the constructor for now, to avoid potentially breaking changes on user code.

I believe I’ve answered your question, and as such I’m closing this issue, but let us know if you have further questions.

fmassa commented, Sep 19, 2020

Thanks for the detailed explanation, that was indeed my question.

I now understand the problem: in https://github.com/facebookresearch/detr/blob/5e66b4cd15b2b182da347103dd16578d28b49d69/util/misc.py#L311 we need to set the type annotation to List[Tensor]; otherwise TorchScript defaults it to Tensor.
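
The annotation behaviour can be seen in a small standalone sketch, assuming a PyTorch install with TorchScript. The two functions are made up for illustration and are unrelated to DETR's code: an argument annotated as List[Tensor] is compiled as a list, while an unannotated argument is treated as a single Tensor.

```python
from typing import List

import torch
from torch import Tensor


def stack_annotated(tensor_list: List[Tensor]) -> Tensor:
    # Explicit annotation: TorchScript compiles the argument as List[Tensor].
    return torch.stack(tensor_list)


def double_unannotated(x):
    # No annotation: TorchScript assumes `x` is a Tensor.
    return x * 2


scripted_stack = torch.jit.script(stack_annotated)
scripted_double = torch.jit.script(double_unannotated)

# The schemas show the argument types TorchScript settled on.
print(scripted_stack.schema)
print(scripted_double.schema)

# Accepted: the argument really is a list of tensors.
stacked = scripted_stack([torch.zeros(2), torch.ones(2)])

# Rejected: a list is passed where a Tensor is expected.
try:
    scripted_double([torch.zeros(2)])
except RuntimeError as err:
    print("type check failed:", type(err).__name__)
```

The rejected call fails with the same kind of argument type check as the C++ `forward()` call in the report.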

Can you try this out, and if it works send a PR fixing it?