RuntimeError: CUDA error: invalid device function (launch_kernel at /pytorch/aten/src/ATen/native/cuda/Loops.cuh:102)

Thanks for your error report and we appreciate it a lot.

Checklist

I have searched related issues but cannot get the expected help.
The bug has not been fixed in the latest version.

Describe the bug

After updating mmdetection with "python setup.py develop", this bug appeared.

RuntimeError: CUDA error: invalid device function (launch_kernel at /pytorch/aten/src/ATen/native/cuda/Loops.cuh:102)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/models/anchor_heads/rpn_head.py", line 92, in get_bboxes_single
    proposals, _ = nms(proposals, cfg.nms_thr)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/ops/nms/nms_wrapper.py", line 55, in nms
    inds = nms_cuda.nms(dets_th, iou_thr)

Reproduction

What command or script did you run?

python tools/train.py configs/***

Did you make any modifications on the code or config? Did you understand what you have modified?

I am just using my own dataset, formatted like COCO, with the number of categories changed.

What dataset did you use?

My own dataset.

Environment

Please run python tools/collect_env.py to collect the necessary environment information and paste it here.

sys.platform: linux
Python: 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.0, V10.0.130
GPU 0,1,2: GeForce GTX 1080 Ti
GCC: gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
PyTorch: 1.3.1
PyTorch compiling details: PyTorch built with:

GCC 7.3
Intel® Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel® 64 architecture applications
Intel® MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CUDA Runtime 10.1
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
CuDNN 7.6.3
Magma 2.5.1
Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.4.2
OpenCV: 4.1.2
MMCV: 0.2.15
MMDetection: 1.0rc1+2b8a5f8
MMDetection Compiler: GCC 7.4
MMDetection CUDA Compiler: 10.0
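The NVCC architecture flags in this report matter because, per the CUDA runtime documentation, cudaErrorInvalidDeviceFunction means the requested device function does not exist or is not compiled for the proper device architecture. As a minimal sketch, the flags can be checked against the GPU in this report. The GeForce GTX 1080 Ti has compute capability 6.1; the helper names compiled_sm_targets and supports are hypothetical and not part of PyTorch or MMDetection.

```python
import re

# NVCC flags copied from the "PyTorch built with" section of this report.
NVCC_FLAGS = (
    "-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;"
    "-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;"
    "-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;"
    "-gencode;arch=compute_37,code=compute_37"
)


def compiled_sm_targets(flags):
    """Return the set of binary (sm_XX) targets named in an NVCC flag string."""
    return set(re.findall(r"code=(sm_\d+)", flags))


def supports(flags, major, minor):
    """True if the flags include a binary target for compute capability major.minor."""
    return f"sm_{major}{minor}" in compiled_sm_targets(flags)


# GeForce GTX 1080 Ti: compute capability 6.1 (hard-coded assumption for this sketch).
print(sorted(compiled_sm_targets(NVCC_FLAGS)))
print(supports(NVCC_FLAGS, 6, 1))
```

In this sketch the check passes for sm_61, which suggests the prebuilt PyTorch binary does target this GPU and that the separately compiled mmdet ops are the more likely place to look.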

You may add additional information that may be helpful for locating the problem, such as
- How you installed PyTorch [e.g., pip, conda, source]
- Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Conda

Error traceback

If applicable, paste the error traceback here.

Traceback (most recent call last):
  File "/media/wrc/0EB90E450EB90E45/tianchi_chongqin/mmdetection/tools/test_one_img.py", line 183, in <module>
    main()
  File "/media/wrc/0EB90E450EB90E45/tianchi_chongqin/mmdetection/tools/test_one_img.py", line 62, in main
    result = inference_detector(model, img_path)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/apis/inference.py", line 86, in inference_detector
    result = model(return_loss=False, rescale=True, **data)
  File "/home/wrc/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/models/detectors/base.py", line 140, in forward
    return self.forward_test(img, img_meta, **kwargs)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/models/detectors/base.py", line 125, in forward_test
    return self.aug_test(imgs, img_metas, **kwargs)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/models/detectors/cascade_rcnn.py", line 414, in aug_test
    self.extract_feats(imgs), img_metas, self.test_cfg.rpn)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/models/detectors/test_mixins.py", line 41, in aug_test_rpn
    proposal_list = self.simple_test_rpn(x, img_meta, rpn_test_cfg)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/models/detectors/test_mixins.py", line 34, in simple_test_rpn
    proposal_list = self.rpn_head.get_bboxes(*proposal_inputs)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/core/fp16/decorators.py", line 127, in new_func
    return old_func(*args, **kwargs)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/models/anchor_heads/anchor_head.py", line 276, in get_bboxes
    scale_factor, cfg, rescale)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/models/anchor_heads/rpn_head.py", line 92, in get_bboxes_single
    proposals, _ = nms(proposals, cfg.nms_thr)
  File "/media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/ops/nms/nms_wrapper.py", line 55, in nms
    inds = nms_cuda.nms(dets_th, iou_thr)
RuntimeError: CUDA error: invalid device function (launch_kernel at /pytorch/aten/src/ATen/native/cuda/Loops.cuh:102)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7fbbc1590813 in /home/wrc/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: void at::native::gpu_index_kernel<__nv_dl_wrapper_t<__nv_dl_tag<void (*)(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>)), 1u>> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>, __nv_dl_wrapper_t<__nv_dl_tag<void (*)(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>)), 1u>> const&) + 0x7bb (0x7fbbc6dddb0b in /home/wrc/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #2: <unknown function> + 0x53f4d52 (0x7fbbc6dd7d52 in /home/wrc/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x53f5218 (0x7fbbc6dd8218 in /home/wrc/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #4: <unknown function> + 0x1a12aeb (0x7fbbc33f5aeb in /home/wrc/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #5: at::native::index(at::Tensor const&, c10::ArrayRef<at::Tensor>) + 0x44e (0x7fbbc33f091e in /home/wrc/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #6: <unknown function> + 0x1f09fba (0x7fbbc38ecfba in /home/wrc/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #7: <unknown function> + 0x39e41cd (0x7fbbc53c71cd in /home/wrc/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #8: at::Tensor::index(c10::ArrayRef<at::Tensor>) const + 0xbb (0x7fbbc508df5b in /home/wrc/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #9: nms_cuda(at::Tensor, float) + 0x822 (0x7fbb9f6c25f8 in /media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/ops/nms/nms_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #10: nms(at::Tensor const&, float) + 0xee (0x7fbb9f6b128e in /media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/ops/nms/nms_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #11: <unknown function> + 0x3a6ed (0x7fbb9f6c06ed in /media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/ops/nms/nms_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #12: <unknown function> + 0x37d9a (0x7fbb9f6bdd9a in /media/wrc/0EB90E450EB90E45/coco_train/mmdetection/mmdet/ops/nms/nms_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #13: _PyCFunction_FastCallDict + 0x154 (0x5634f4345c54 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #14: <unknown function> + 0x199abc (0x5634f43cdabc in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #15: _PyEval_EvalFrameDefault + 0x30a (0x5634f43f075a in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #16: <unknown function> + 0x192e66 (0x5634f43c6e66 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #17: <unknown function> + 0x193e73 (0x5634f43c7e73 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #18: <unknown function> + 0x199b95 (0x5634f43cdb95 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #19: _PyEval_EvalFrameDefault + 0x30a (0x5634f43f075a in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #20: <unknown function> + 0x192e66 (0x5634f43c6e66 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #21: <unknown function> + 0x193e73 (0x5634f43c7e73 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #22: <unknown function> + 0x199b95 (0x5634f43cdb95 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #23: _PyEval_EvalFrameDefault + 0x30a (0x5634f43f075a in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #24: PyEval_EvalCodeEx + 0x966 (0x5634f43c8ff6 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #25: <unknown function> + 0x1958e6 (0x5634f43c98e6 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #26: PyObject_Call + 0x3e (0x5634f4345a5e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #27: _PyEval_EvalFrameDefault + 0x19e7 (0x5634f43f1e37 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #28: <unknown function> + 0x19329e (0x5634f43c729e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #29: _PyFunction_FastCallDict + 0x1be (0x5634f43c837e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #30: _PyObject_FastCallDict + 0x26f (0x5634f434601f in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #31: _PyObject_Call_Prepend + 0x63 (0x5634f434aaa3 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #32: PyObject_Call + 0x3e (0x5634f4345a5e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #33: _PyEval_EvalFrameDefault + 0x19e7 (0x5634f43f1e37 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #34: <unknown function> + 0x193c5b (0x5634f43c7c5b in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #35: <unknown function> + 0x199b95 (0x5634f43cdb95 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #36: _PyEval_EvalFrameDefault + 0x30a (0x5634f43f075a in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #37: <unknown function> + 0x19329e (0x5634f43c729e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #38: <unknown function> + 0x193ed6 (0x5634f43c7ed6 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #39: <unknown function> + 0x199b95 (0x5634f43cdb95 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #40: _PyEval_EvalFrameDefault + 0x30a (0x5634f43f075a in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #41: <unknown function> + 0x192e66 (0x5634f43c6e66 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #42: _PyFunction_FastCallDict + 0x3d8 (0x5634f43c8598 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #43: _PyObject_FastCallDict + 0x26f (0x5634f434601f in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #44: _PyObject_Call_Prepend + 0x63 (0x5634f434aaa3 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #45: PyObject_Call + 0x3e (0x5634f4345a5e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #46: _PyEval_EvalFrameDefault + 0x19e7 (0x5634f43f1e37 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #47: <unknown function> + 0x192e66 (0x5634f43c6e66 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #48: _PyFunction_FastCallDict + 0x3d8 (0x5634f43c8598 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #49: _PyObject_FastCallDict + 0x26f (0x5634f434601f in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #50: _PyObject_Call_Prepend + 0x63 (0x5634f434aaa3 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #51: PyObject_Call + 0x3e (0x5634f4345a5e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #52: _PyEval_EvalFrameDefault + 0x19e7 (0x5634f43f1e37 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #53: PyEval_EvalCodeEx + 0x329 (0x5634f43c89b9 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #54: <unknown function> + 0x1958e6 (0x5634f43c98e6 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #55: PyObject_Call + 0x3e (0x5634f4345a5e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #56: _PyEval_EvalFrameDefault + 0x19e7 (0x5634f43f1e37 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #57: <unknown function> + 0x19329e (0x5634f43c729e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #58: _PyFunction_FastCallDict + 0x3d8 (0x5634f43c8598 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #59: _PyObject_FastCallDict + 0x26f (0x5634f434601f in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #60: _PyObject_Call_Prepend + 0x63 (0x5634f434aaa3 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #61: PyObject_Call + 0x3e (0x5634f4345a5e in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #62: _PyEval_EvalFrameDefault + 0x19e7 (0x5634f43f1e37 in /home/wrc/.conda/envs/mm2/bin/python3.6)
frame #63: <unknown function> + 0x192e66 (0x5634f43c6e66 in /home/wrc/.conda/envs/mm2/bin/python3.6)

Bug fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

Issue Analytics

State:
Created 4 years ago
Comments:9 (3 by maintainers)

Top GitHub Comments

2 reactions

hellock commented, Jan 20, 2020

The reason for those problems is clear. Please make sure that:

You installed a PyTorch build that was prebuilt with the same CUDA version as your locally installed CUDA (e.g., the one in /usr/local/cuda).
You compiled mmdet with the same CUDA version as your runtime.

It is possible that it worked well before and broke after you upgraded PyTorch. If you did not specify the CUDA version when installing PyTorch, the default CUDA version can differ between PyTorch versions, e.g., an older PyTorch release may be prebuilt with CUDA 10.0 while a newer one is prebuilt with CUDA 10.1 by default.
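The two checks in this comment can be applied to the environment report from the issue. The following is a minimal sketch that only parses the two strings printed by tools/collect_env.py; the helper names major_minor and cuda_versions_match are hypothetical, and nothing here calls into PyTorch or CUDA.

```python
import re


def major_minor(text):
    """Extract the first 'X.Y' version number in a string as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def cuda_versions_match(nvcc_line, runtime_line):
    """True if the local NVCC and the PyTorch CUDA runtime agree on major.minor."""
    return major_minor(nvcc_line) == major_minor(runtime_line)


# Both lines are copied from the environment report in the issue.
nvcc_line = "Cuda compilation tools, release 10.0, V10.0.130"
runtime_line = "CUDA Runtime 10.1"

print(major_minor(nvcc_line), major_minor(runtime_line))
print(cuda_versions_match(nvcc_line, runtime_line))
```

For this report the comparison fails (10.0 versus 10.1), which is consistent with the mismatch described in the comment. On a machine with PyTorch installed, torch.version.cuda reports the CUDA version the binary was built with and can be compared against the output of nvcc --version in the same way.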

1 reaction

MyLtYkRiTiK commented, Feb 28, 2020

I had the same problem. Reinstalling CUDA, PyTorch, and so on did not help: the same errors remained, even after I recompiled mmdet. Then I deleted the existing mmdetection folder, downloaded it again, and compiled it.

If you need them, here are a video and a blog post on how to upgrade CUDA: https://www.youtube.com/watch?v=FhR8hL-xNDk https://medium.com/@exesse/cuda-10-1-installation-on-ubuntu-18-04-lts-d04f89287130
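Deleting the folder and downloading it again also discards any previously compiled extension files, such as the mmdet/ops/nms/nms_cuda.cpython-36m-x86_64-linux-gnu.so seen in the traceback. One possible reading is that stale binaries built against an older CUDA survived the earlier recompiles, though the thread does not confirm this. As a rough sketch under that assumption, the compiled files can be located and removed before rebuilding; both function names are hypothetical, and the demo runs on a throwaway directory rather than a real checkout.

```python
import tempfile
from pathlib import Path


def find_compiled_extensions(repo_root):
    """List compiled extension modules (*.so) under repo_root, sorted by path."""
    return sorted(Path(repo_root).rglob("*.so"))


def remove_compiled_extensions(repo_root, dry_run=True):
    """Return the compiled extensions found; delete them unless dry_run is True."""
    stale = find_compiled_extensions(repo_root)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale


# Demo on a temporary directory that mimics the layout seen in the traceback.
with tempfile.TemporaryDirectory() as tmp:
    ops = Path(tmp) / "mmdet" / "ops" / "nms"
    ops.mkdir(parents=True)
    (ops / "nms_cuda.cpython-36m-x86_64-linux-gnu.so").touch()
    (ops / "nms_wrapper.py").touch()

    removed = remove_compiled_extensions(tmp, dry_run=False)
    print([p.name for p in removed])
    print(find_compiled_extensions(tmp))
```

After removing the binaries from a real checkout, the ops would need to be rebuilt, for example with the "python setup.py develop" command mentioned in the issue.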
