NVRTCError: NVRTC_ERROR_COMPILATION with `cupy.cuda.compile_with_cache`
🐛 Bug
I get the following error when I convert a model that uses `cupy.cuda.compile_with_cache` to JIT with `torch.jit.trace`.
NVRTCError: NVRTC_ERROR_COMPILATION (6)
During handling of the above exception, another exception occurred:
CompileException Traceback (most recent call last)
cupy/util.pyx in cupy.util.memoize.decorator.ret()
/usr/local/lib/python3.7/dist-packages/cupy/cuda/compiler.py in compile(self, options)
440 except nvrtc.NVRTCError:
441 log = nvrtc.getProgramLog(self.ptr)
--> 442 raise CompileException(log, self.src, self.name, options, 'nvrtc')
443
444
CompileException: /tmp/tmpan1ut480/3b7c153ce98d06488f1cbac8793f6dff_2.cubin.cu(16): error: identifier "tensor" is undefined
1 error detected in the compilation of "/tmp/tmpan1ut480/3b7c153ce98d06488f1cbac8793f6dff_2.cubin.cu".
To Reproduce
This is a colab to reproduce the error. https://colab.research.google.com/drive/1WDRCN6wPIAsl5tBFKfne0ABN49estM9P?usp=sharing
This is minimal code to reproduce it:
import cupy
import torch
import torch.nn as nn

@cupy.util.memoize(for_each_device=True)
def cupy_launch(strFunction, strKernel):
    return cupy.cuda.compile_with_cache(strKernel).get_function(strFunction)

kernel_Correlation_rearrange = " .... "

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

    def forward(self, x_warp_after, x_cond):
        # Note: cupy_kernel is not defined in this snippet.
        cupy_launch('kernel_Correlation_rearrange', cupy_kernel('kernel_Correlation_rearrange', {
            'intStride': 1,
            'input': x_warp_after,
            'output': x_cond
        }))(
        )
        return x_warp_after, x_cond

net = Net().cuda()
input1 = torch.randn([1, 256, 8, 6]).cuda()
input2 = torch.randn([1, 256, 8, 6]).cuda()
trace_model = torch.jit.trace(net, [input1, input2])
Expected behavior
I think the above error occurs when I use cupy.cuda.compile_with_cache.
Environment
- CuPy version: cupy-cuda101==7.4.0
- CUDA/cuDNN version: 11.0.221
- PyTorch Version (e.g., 1.0): 1.8.1+cu101
- OS (e.g., Linux): Ubuntu 18.04.5 LTS (x86_64)
- How you installed PyTorch (conda, pip, source): pip
- Build command you used (if compiling from source): no
- Python version: 3.7 (64-bit runtime)
- GPU models and configuration: GPU 0: Tesla T4
- Any other relevant information:
Additional context
I opened an issue in the pytorch repository before, but I realized that the problem is not a pytorch issue, but a cupy issue.
Top GitHub Comments
This is neither PyTorch's nor CuPy's bug, but rather an issue in the way you did the string processing to generate your kernel. Notice this error:
    error: identifier "tensor" is undefined
It is a common C/C++ error telling you that the definition of the identifier `tensor` is missing. You should check how that identifier entered the code string. CuPy provides some environment variables, and the one you need to help you debug the code generation is either `CUPY_CACHE_SAVE_CUDA_SOURCE` or `CUPY_DUMP_CUDA_SOURCE_ON_ERROR`.
By the way, it is best not to use `cupy.cuda.compile_with_cache()` because it is subject to change without notification (it's considered internal API AFAIK). There is a public API, `cupy.RawModule`, for exactly this need (see tutorial).

Thanks to @kmaehashi's advice, this problem has been solved.
The problem is that, when the model is converted to JIT, a value of type int becomes a tensor. I solved the problem by rewriting the affected code.
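To illustrate the failure mode described above, here is a minimal sketch in plain Python. It is not the author's actual fix. `TracedInt`, `render_kernel`, the `{{name}}` placeholder syntax, and the `coerce_ints` flag are all hypothetical names introduced for illustration; `TracedInt` only stands in for a value that prints as `tensor(...)` during tracing. It shows how such a value can leak the identifier `tensor` into generated kernel source, and how casting to `int` before substitution avoids it.

```python
class TracedInt:
    """Hypothetical stand-in for an int that became a tensor during tracing."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value

    def __repr__(self):
        # Mimics the textual form that ends up in the kernel string.
        return f"tensor({self.value})"


# Hypothetical one-line kernel template with a {{name}} placeholder.
KERNEL_TEMPLATE = "const int stride = {{intStride}};"


def render_kernel(template, variables, coerce_ints=False):
    """Substitute each {{name}} placeholder with the text of its value."""
    source = template
    for name, value in variables.items():
        if coerce_ints:
            # Cast to a plain Python int so only digits reach the source.
            value = int(value)
        source = source.replace("{{" + name + "}}", str(value))
    return source


broken = render_kernel(KERNEL_TEMPLATE, {"intStride": TracedInt(1)})
fixed = render_kernel(KERNEL_TEMPLATE, {"intStride": TracedInt(1)}, coerce_ints=True)
print(broken)  # contains the undefined identifier "tensor"
print(fixed)   # contains a plain integer literal
```

In a real model the cast would belong wherever integer arguments are formatted into the kernel string, assuming those arguments are the ones that turn into tensors under `torch.jit.trace`.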