cuTENSOR 1.6: failure in math operations on some arrays with singleton dimensions

Description

I observed one test failure in cuCIM today when running the test suite with CuPy 11, but only when CUPY_ACCELERATORS contains "cutensor". I have not yet gone back to try older CuPy versions, and I have not tested older versions of cuTENSOR either.
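Since the failure only appears when CUPY_ACCELERATORS contains "cutensor", one way to sidestep it while investigating is to drop that entry before CuPy is imported. The following is a minimal sketch, not a confirmed fix: `without_cutensor` is a hypothetical helper, and it assumes the variable is a comma-separated list that CuPy reads at import time.

```python
import os


def without_cutensor(value):
    """Return a comma-separated accelerator list with "cutensor" removed."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    return ",".join(name for name in names if name != "cutensor")


# Assumption: this runs before `import cupy`, so CuPy sees the new value.
cleaned = without_cutensor(os.environ.get("CUPY_ACCELERATORS", ""))
if cleaned:
    os.environ["CUPY_ACCELERATORS"] = cleaned
else:
    # Nothing left in the list: unset the variable so CuPy uses its defaults.
    os.environ.pop("CUPY_ACCELERATORS", None)
```

Other accelerators in the list (for example "cub") are left untouched, so only the cuTENSOR code path is bypassed.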

To Reproduce

A minimal reproducer is

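# Run with CUPY_ACCELERATORS=cutensor set in the environment
import cupy as cp
a = cp.ones((4, 1, 5), dtype=float)  # singleton dimension in the middle
b = a.copy()
a - b  # raises CuTensorError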

which gives the error:

CuTensorError                             Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 a - b

File ~/src/public/cupy/cupy/_core/core.pyx:1271, in cupy._core.core._ndarray_base.__sub__()

File ~/src/public/cupy/cupy/_core/_kernel.pyx:1259, in cupy._core._kernel.ufunc.__call__()

File ~/src/public/cupy/cupy/cutensor.pyx:905, in cupy.cutensor._try_elementwise_binary_routine()

File ~/src/public/cupy/cupy/cutensor.pyx:402, in cupy.cutensor._elementwise_binary_impl()

File ~/src/public/cupy/cupy_backends/cuda/libs/cutensor.pyx:507, in cupy_backends.cuda.libs.cutensor.elementwiseBinary()

File ~/src/public/cupy/cupy_backends/cuda/libs/cutensor.pyx:260, in cupy_backends.cuda.libs.cutensor.check_status()

Installation

Source (pip install cupy)

Environment

OS                           : Linux-5.14.0-1045-oem-x86_64-with-glibc2.31
Python Version               : 3.9.13
CuPy Version                 : 11.0.0
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.21.6
SciPy Version                : 1.8.1
Cython Build Version         : 0.29.30
Cython Runtime Version       : 0.29.30
CUDA Root                    : /usr/local/cuda
nvcc PATH                    : /usr/local/cuda/bin/nvcc
CUDA Build Version           : 11070
CUDA Driver Version          : 11070
CUDA Runtime Version         : 11070
cuBLAS Version               : (available)
cuFFT Version                : 10702
cuRAND Version               : 10210
cuSOLVER Version             : (11, 3, 5)
cuSPARSE Version             : (available)
NVRTC Version                : (11, 7)
Thrust Version               : 101500
CUB Build Version            : 101500
Jitify Build Version         : 4a37de0
cuDNN Build Version          : 8401
cuDNN Version                : 8401
NCCL Build Version           : 21304
NCCL Runtime Version         : 21304
cuTENSOR Version             : 10600
cuSPARSELt Build Version     : None
Device 0 Name                : NVIDIA RTX A6000
Device 0 Compute Capability  : 86
Device 0 PCI Bus ID          : 0000:17:00.0
Device 1 Name                : Quadro P620
Device 1 Compute Capability  : 61
Device 1 PCI Bus ID          : 0000:65:00.0

Additional Information

For the example above, the operation only seems to fail when a "singleton" dimension (size 1) is present and is in neither the first nor the last position of the shape. For example, the following shapes all work fine:

(1, 4, 5)
(5, 4, 1)
(2, 3, 4)

but the following shapes, which have a singleton in one of the interior positions, produce the same error:

(2, 3, 1, 5)
(2, 1, 3, 5)

The failure is not specific to subtraction; it also occurs for other operations (+, /, *).
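The pattern described above can be written down as a small check. This is a sketch under stated assumptions: `has_interior_singleton` and `probe` are hypothetical helper names, the rule is inferred only from the shapes listed in this report, and `probe` needs a GPU with CuPy installed and CUPY_ACCELERATORS=cutensor set before the interpreter starts.

```python
import operator

# The four operators reported as failing.
OPS = {"-": operator.sub, "+": operator.add, "/": operator.truediv, "*": operator.mul}


def has_interior_singleton(shape):
    """True if a size-1 axis sits strictly between the first and last axes."""
    return any(n == 1 for n in shape[1:-1])


def probe(shape):
    """Return the operators that raise for two arrays of ones with this shape."""
    import cupy as cp  # imported lazily so the shape check works without a GPU

    a = cp.ones(shape, dtype=float)
    b = a.copy()
    failed = []
    for symbol, op in OPS.items():
        try:
            op(a, b)
        except Exception as exc:  # CuTensorError in the traceback above
            failed.append((symbol, type(exc).__name__))
    return failed


shapes = [(1, 4, 5), (5, 4, 1), (2, 3, 4), (4, 1, 5), (2, 3, 1, 5), (2, 1, 3, 5)]
for shape in shapes:
    print(shape, "expected to fail:", has_interior_singleton(shape))
```

On an affected setup, calling `probe(shape)` for each entry in `shapes` should report failures only for the shapes flagged by `has_interior_singleton`, if the pattern in this report holds.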

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 2
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

grlee77 commented, Jul 28, 2022 (3 reactions)

Ah, I see that the install docs don’t yet list 1.6.0 as supported. I had that version because I am using recent Ubuntu 20.04 packages.

I just tried with a pip install of cupy 11.x and cutensor 1.5.0 obtained via

python -m cupyx.tools.install_library --cuda 11.x --library cutensor

and do not see the issue there.
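To tell which cuTENSOR release an environment is using, the integer shown in the "cuTENSOR Version" row of the environment report (10600 above) can be decoded. This is a sketch that assumes the encoding is major * 10000 + minor * 100 + patch, which matches 10600 for the 1.6 release named in the title. Both function names are hypothetical, and the "affected" rule reflects only what this thread reports: 1.6.0 fails and 1.5.0 does not.

```python
def decode_cutensor_version(version):
    """Split an integer such as 10600 into (major, minor, patch)."""
    major, rest = divmod(version, 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def reported_affected(version):
    """True for the 1.6 series, the only one reported as failing in this thread."""
    return decode_cutensor_version(version)[:2] == (1, 6)


for raw in (10500, 10600):
    print(raw, decode_cutensor_version(raw), "affected:", reported_affected(raw))
```

Other releases were not tested in this thread, so a False result only means the version was not reported as failing here.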
