cuTENSOR 1.6: failure in math operations on some arrays with singleton dimensions
Description
I observed one test failure in cuCIM today when running the test suite with CuPy 11, but only when CUPY_ACCELERATORS
contains “cutensor”. I have not yet gone back and tried older CuPy versions, and I have not tested older versions of cuTENSOR.
To Reproduce
A minimal reproducer is
import cupy as cp
a = cp.ones((4, 1, 5), dtype=float)
b = a.copy()
a - b
which gives the error:
CuTensorError Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 a - b
File ~/src/public/cupy/cupy/_core/core.pyx:1271, in cupy._core.core._ndarray_base.__sub__()
File ~/src/public/cupy/cupy/_core/_kernel.pyx:1259, in cupy._core._kernel.ufunc.__call__()
File ~/src/public/cupy/cupy/cutensor.pyx:905, in cupy.cutensor._try_elementwise_binary_routine()
File ~/src/public/cupy/cupy/cutensor.pyx:402, in cupy.cutensor._elementwise_binary_impl()
File ~/src/public/cupy/cupy_backends/cuda/libs/cutensor.pyx:507, in cupy_backends.cuda.libs.cutensor.elementwiseBinary()
File ~/src/public/cupy/cupy_backends/cuda/libs/cutensor.pyx:260, in cupy_backends.cuda.libs.cutensor.check_status()
Installation
Source (pip install cupy)
Environment
OS : Linux-5.14.0-1045-oem-x86_64-with-glibc2.31
Python Version : 3.9.13
CuPy Version : 11.0.0
CuPy Platform : NVIDIA CUDA
NumPy Version : 1.21.6
SciPy Version : 1.8.1
Cython Build Version : 0.29.30
Cython Runtime Version : 0.29.30
CUDA Root : /usr/local/cuda
nvcc PATH : /usr/local/cuda/bin/nvcc
CUDA Build Version : 11070
CUDA Driver Version : 11070
CUDA Runtime Version : 11070
cuBLAS Version : (available)
cuFFT Version : 10702
cuRAND Version : 10210
cuSOLVER Version : (11, 3, 5)
cuSPARSE Version : (available)
NVRTC Version : (11, 7)
Thrust Version : 101500
CUB Build Version : 101500
Jitify Build Version : 4a37de0
cuDNN Build Version : 8401
cuDNN Version : 8401
NCCL Build Version : 21304
NCCL Runtime Version : 21304
cuTENSOR Version : 10600
cuSPARSELt Build Version : None
Device 0 Name : NVIDIA RTX A6000
Device 0 Compute Capability : 86
Device 0 PCI Bus ID : 0000:17:00.0
Device 1 Name : Quadro P620
Device 1 Compute Capability : 61
Device 1 PCI Bus ID : 0000:65:00.0
Additional Information
This operation only seems to fail for the example above when the “singleton” dimension is present and not in either the first or last position of the shape. For example the following shapes all work fine:
(1, 4, 5)
(5, 4, 1)
(2, 3, 4)
but the following shapes, with a singleton in one of the center positions, also produce the same error:
(2, 3, 1, 5)
(2, 1, 3, 5)
The failure is not specific to subtraction; it also occurs for other operations (+, /, *).
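The pattern described above can be written down as a small predicate. This is only a sketch of the observed rule (a size-1 dimension that is neither first nor last), not a description of what cuTENSOR does internally. The helper name `has_interior_singleton` is hypothetical and is not part of CuPy or cuTENSOR. It runs without a GPU.

```python
def has_interior_singleton(shape):
    """Return True if any dimension other than the first or last has size 1.

    Hypothetical helper: it encodes the pattern observed in this report only.
    """
    return any(n == 1 for n in shape[1:-1])


# Shapes taken from this report.
failing = [(4, 1, 5), (2, 3, 1, 5), (2, 1, 3, 5)]
working = [(1, 4, 5), (5, 4, 1), (2, 3, 4)]

for shape in failing + working:
    print(shape, has_interior_singleton(shape))
```

The predicate returns True for the three failing shapes and False for the three working ones, so it can be used to select test cases when checking another cuTENSOR version.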
Issue Analytics
- State:
- Created a year ago
- Reactions:2
- Comments:7 (7 by maintainers)
Top GitHub Comments
Ah, I see that the install docs don’t yet list 1.6.0 as supported. I had that version because I am using recent Ubuntu 20.04 packages.
I just tried with a pip install of cupy 11.x and cutensor 1.5.0 obtained via
and do not see the issue there.
Closed. See the cuTENSOR 1.6.1 release notes: https://docs.nvidia.com/cuda/cutensor/release_notes.html#cutensor-v1-6-1
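For an environment that cannot change its cuTENSOR version right away, one possible stopgap follows from the observation that the failure only appears when CUPY_ACCELERATORS contains “cutensor”: remove that entry before CuPy is imported. The sketch below assumes the variable is a comma-separated list and that CuPy reads it at import time. The helper name `without_accelerator` is hypothetical. The variable is left untouched when it is not set, so CuPy's defaults still apply.

```python
import os


def without_accelerator(value, name="cutensor"):
    """Return a comma-separated accelerator list with `name` removed.

    Hypothetical helper; assumes entries are separated by commas.
    """
    entries = [e.strip() for e in value.split(",") if e.strip()]
    return ",".join(e for e in entries if e.lower() != name)


# Assumption: this must run before `import cupy` to have any effect.
current = os.environ.get("CUPY_ACCELERATORS")
if current is not None:
    os.environ["CUPY_ACCELERATORS"] = without_accelerator(current)
```

Other accelerators in the list, such as cub, are preserved; only the cutensor entry is removed.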