RuntimeError: CUDA error: invalid device function ROIAlign_forward_cuda


Running forward inference with the Panoptic FPN model results in a CUDA error.

To Reproduce

Run a predictor using the model config panoptic_fpn_R_101_dconv_cascade_gn_3x.yaml.

The following error is produced:

error in deformable_im2col: invalid device function
... < repeated ~30 times> ...
File "/opt/conda/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "/app/detectron2/detectron2/engine/defaults.py", line 176, in __call__
    predictions = self.model([inputs])[0]
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/app/detectron2/detectron2/modeling/meta_arch/panoptic_fpn.py", line 98, in forward
    images, features, proposals, gt_instances
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/app/detectron2/detectron2/modeling/roi_heads/cascade_rcnn.py", line 97, in forward
    pred_instances = self._forward_box(features_list, proposals)
  File "/app/detectron2/detectron2/modeling/roi_heads/cascade_rcnn.py", line 112, in _forward_box
    head_outputs.append(self._run_stage(features, proposals, k))
  File "/app/detectron2/detectron2/modeling/roi_heads/cascade_rcnn.py", line 203, in _run_stage
    box_features = self.box_pooler(features, [x.proposal_boxes for x in proposals])
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/app/detectron2/detectron2/modeling/poolers.py", line 192, in forward
    output[inds] = pooler(x_level, pooler_fmt_boxes_level)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/app/detectron2/detectron2/layers/roi_align.py", line 95, in forward
    input, rois, self.output_size, self.spatial_scale, self.sampling_ratio, self.aligned
  File "/app/detectron2/detectron2/layers/roi_align.py", line 20, in forward
    input, roi, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned
RuntimeError: CUDA error: invalid device function (ROIAlign_forward_cuda at /app/detectron2/detectron2/layers/csrc/ROIAlign/ROIAlign_cuda.cu:359)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7ffa5c402687 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: ROIAlign_forward_cuda(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xa37 (0x7ffa0653b6f5 in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #2: ROIAlign_forward(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0xbc (0x7ffa064c9fdc in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x5961a (0x7ffa064db61a in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x5971e (0x7ffa064db71e in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x53ca0 (0x7ffa064d5ca0 in /app/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
<omitting python frames>
frame #12: THPFunction_apply(_object*, _object*) + 0x9ff (0x7ffa5d63dacf in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)

Environment

---------------------  -------------------------------------------------------------------
Python                 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0]
Detectron2 Compiler    GCC 5.4
DETECTRON2_ENV_MODULE  <not set>
PyTorch                1.3.0
PyTorch Debug Build    False
CUDA available         True
GPU 0,1                GeForce GTX 1080 Ti
Pillow                 6.2.0
cv2                    3.4.4
---------------------  -------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50
  - CuDNN 7.6.3
  - Magma 2.5.1
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF
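
One quick sanity check on the environment dump above is whether the GPU's compute capability appears among the NVCC architecture flags. The sketch below parses the flag string quoted in this report and looks for the GeForce GTX 1080 Ti, which has compute capability 6.1. The function names are hypothetical helpers written for illustration. Note the assumption: these flags describe how PyTorch itself was built, not how the detectron2 extension (_C) was compiled, so a match here does not rule out a problem in the extension build.

```python
import re

# Flag string copied from the "NVCC architecture flags" line of this report.
PYTORCH_ARCH_FLAGS = (
    "-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;"
    "-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;"
    "-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;"
    "-gencode;arch=compute_50,code=compute_50"
)


def compiled_sm_targets(arch_flags):
    """Return the set of (major, minor) capabilities with native sm_XY code."""
    targets = set()
    for digits in re.findall(r"code=sm_(\d+)", arch_flags):
        # The last digit is the minor version, everything before it the major.
        targets.add((int(digits[:-1]), int(digits[-1])))
    return targets


def has_native_code_for(capability, arch_flags):
    """True if binary code for the given (major, minor) capability was built."""
    return tuple(capability) in compiled_sm_targets(arch_flags)


if __name__ == "__main__":
    gtx_1080_ti = (6, 1)  # compute capability of the GPU listed in this report
    print(sorted(compiled_sm_targets(PYTORCH_ARCH_FLAGS)))
    print(has_native_code_for(gtx_1080_ti, PYTORCH_ARCH_FLAGS))
```

On a live machine, torch.cuda.get_device_capability(0) returns the same kind of (major, minor) tuple for the installed GPU and could be passed in place of the hard-coded value.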

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 4
  • Comments: 21

Top GitHub Comments

ppwwyyxx commented, Oct 15, 2019 (30 reactions)

I was able to reproduce the same error when I use the wrong version of cuda.

What I did: I installed PyTorch with conda install pytorch torchvision cudatoolkit=10.1 -c pytorch, but my local CUDA runtime and nvcc are version 10.0. In this case, I can observe the same error. Please check whether your CUDA version is correct.
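
The check described in this comment can be sketched in Python by comparing the CUDA version PyTorch was built against (torch.version.cuda) with the release reported by the nvcc found on PATH. This is a minimal sketch, not an official detectron2 tool. The helper names are hypothetical, and it assumes that nvcc --version prints a line of the form "release 10.0, V10.0.130" and that the nvcc on PATH is the one used to compile the extension.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return "{}.{}".format(match.group(1), match.group(2)) if match else None


def same_major_minor(version_a, version_b):
    """Compare two version strings on their major.minor components only."""
    if not version_a or not version_b:
        return False
    return version_a.split(".")[:2] == version_b.split(".")[:2]


def local_nvcc_release():
    """Return the release of the nvcc on PATH, or None if nvcc is not found."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], stdout=subprocess.PIPE, universal_newlines=True
    )
    return parse_nvcc_release(result.stdout)


def pytorch_cuda_version():
    """Return the CUDA version PyTorch was built with, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    built_with = pytorch_cuda_version()
    nvcc_release = local_nvcc_release()
    print("PyTorch built with CUDA:", built_with)
    print("Local nvcc release:     ", nvcc_release)
    if built_with and nvcc_release and not same_major_minor(built_with, nvcc_release):
        print("Mismatch: extensions compiled with this nvcc may fail at runtime.")
```

If the two versions differ, as in the 10.1 versus 10.0 case described above, the versions need to be aligned and the extension rebuilt before the kernels can be expected to load.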

ppwwyyxx commented, Oct 29, 2019 (5 reactions)

It seems that a mismatch between the NVCC version and the CUDA runtime version is the root cause. Closing, but feel free to reopen if this does not solve your issue.


Top Results From Across the Web

RuntimeError: CUDA error: invalid device function - C++
i am using a c++/cuda custom extension. when running the extension, i get this error: RuntimeError: CUDA error: invalid device function.

CUDA - invalid device function, how to know [architecture ...
I am getting the following error when running the ...

cudaLaunchKernel returned status 98: invalid device function
Where I've seen the “invalid device function” error is typically due to a mismatch in the CUDA version or target device. So given...

RuntimeError: CUDA error: invalid device ordinal issue with ...
I am trying to run the basic CIRFAR example, but keep running into errors when using ray train + ray tune, here is...

NVIDIA CUDA Library: cudaError
cudaSuccess, The API call returned with no errors. ... cudaErrorInvalidDeviceFunction, The requested device function does not exist or is not compiled for ......
