Error when using XtPlanNd for FP16 R2C transformation

Description

Since XtPlanNd isn’t documented beyond this sample, I might just be holding it wrong (in particular, I had to figure out what last_axis and last_size are for). I tried to modify that example to do a real-to-complex transform as attached below.

When run, it gives this output:

Traceback (most recent call last):
  File "./cupy_ft16.py", line 16, in <module>
    plan = cp.cuda.cufft.XtPlanNd(shape[1:],
  File "cupy/cuda/cufft.pyx", line 968, in cupy.cuda.cufft.XtPlanNd.__init__
  File "cupy/cuda/cufft.pyx", line 1068, in cupy.cuda.cufft.XtPlanNd._sanity_checks
ValueError: size must be power of 2

I believe the issue is this check, which is okay for C2C and C2R, but (assuming I’ve supplied last_size correctly) in R2C it is failing because the last dimension of the output array is floor(n/2)+1, which is not a power of 2 even though the problem size is a power of 2.
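The arithmetic behind that failure can be checked in a few lines. This is only an illustrative sketch: `is_power_of_two` is a hypothetical helper written for this example, not the actual check in `cupy/cuda/cufft.pyx`. The value of `r2c_out` matches what the reproducer below passes as last_size (`out.shape[-1] // 2`).

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of two (hypothetical helper)."""
    return n > 0 and (n & (n - 1)) == 0


n = 65536             # transform length along the last axis (2**16)
r2c_out = n // 2 + 1  # complex elements an R2C transform produces on that axis

print(n, is_power_of_two(n))
print(r2c_out, is_power_of_two(r2c_out))
```

The problem size passes the test, while the R2C output length (32769, an odd number) does not, which is consistent with the ValueError in the traceback above.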

On a semi-related note, those checks are also not testing this condition on FP16 transforms from the CUDA docs:

  • Strides on the real part of real-to-complex and complex-to-real transforms are not supported

To Reproduce

#!/usr/bin/env python3

import cupy as cp
import numpy as np


shape = (1024, 65536)  # input array shape
idtype = 'e'  # numpy.float16
odtype = edtype = 'E'  # = numpy.complex32 in the future

# store the output array as fp16 arrays twice as long, as complex32 is not yet available
a = cp.random.random(shape).astype(cp.float16)
out = cp.empty_like(a, shape=(shape[0], shape[1] + 2))

# FFT with cuFFT
plan = cp.cuda.cufft.XtPlanNd(shape[1:],
                              a.shape[1:], 1, a.shape[1], idtype,
                              (out.shape[1] // 2,), 1, (out.shape[1] // 2), odtype,
                              shape[0], edtype,
                              order='C', last_axis=-1, last_size=out.shape[-1] // 2)

plan.fft(a, out, cp.cuda.cufft.CUFFT_FORWARD)

# FFT with NumPy
a_np = cp.asnumpy(a).astype(np.float32)  # upcast
out_np = np.fft.rfftn(a_np, axes=(-1,))
out_np = np.ascontiguousarray(out_np).astype(np.complex64)  # downcast
out_np = out_np.view(np.float32)
out_np = out_np.astype(np.float16)

# don't worry about accuracy for now, as we probably lost a lot during casting
print('ok' if cp.mean(cp.abs(out - cp.asarray(out_np))) < 0.1 else 'not ok')

Installation

Wheel (pip install cupy-***)

Environment

OS                           : Linux-5.17.5-76051705-generic-x86_64-with-glibc2.29
Python Version               : 3.8.10
CuPy Version                 : 10.5.0
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.21.5
SciPy Version                : 1.8.1
Cython Build Version         : 0.29.24
Cython Runtime Version       : 0.29.21
CUDA Root                    : /usr/local/cuda
nvcc PATH                    : /usr/local/cuda/bin/nvcc
CUDA Build Version           : 11040
CUDA Driver Version          : 11060
CUDA Runtime Version         : 11040
cuBLAS Version               : (available)
cuFFT Version                : 10502
cuRAND Version               : 10205
cuSOLVER Version             : (11, 2, 0)
cuSPARSE Version             : (available)
NVRTC Version                : (11, 4)
Thrust Version               : 101201
CUB Build Version            : 101201
Jitify Build Version         : 4a37de0
cuDNN Build Version          : (not loaded; try `import cupy.cuda.cudnn` first)
cuDNN Version                : (not loaded; try `import cupy.cuda.cudnn` first)
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : None
cuSPARSELt Build Version     : None
Device 0 Name                : NVIDIA GeForce RTX 2060
Device 0 Compute Capability  : 75
Device 0 PCI Bus ID          : 0000:01:00.0

Additional Information

No response

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

leofang commented, Jun 21, 2022

@kmaehashi plz assign this to me and I’ll try to get this addressed for CuPy v11.0.

leofang commented, Jun 10, 2022

> I’m guessing last_size is only needed to compute the shape of the output if no output is provided?

That’s right.

> I think the current checks are not very appropriate to R2C/C2R as you pointed out.
>
> I’m actually wondering if checking only last_size is correct even for C2C. I haven’t tested it, but I would assume that all the transform dimensions (the first argument to XtPlanNd) would need to be powers of 2.

That’s only applicable to C2C/R2C. For C2R it’s the output size that should be power of 2; the input size is an odd number (in the transformed axis).

> Let me further note that not much docstring was added for XtPlanNd because it was considered a low-level wrapper over cufftXtMakePlanMany and we expected advanced users to check out cuFFT documentation.
>
> That’s fair, but last_axis and last_size are specific to cupy rather than arguments to cufftXtMakePlanMany.

That’s another fair point. Perhaps I should just add default values to them, and do not show them in the examples. They’re needed for integrating with the high-level APIs.
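The rule described in this comment (validate the transform size for C2C/R2C, and the output size for C2R) could be sketched roughly as follows. This is a hypothetical illustration only: `fp16_size_ok` and the string labels 'C2C', 'R2C', 'C2R' are made up for this example and are not CuPy's API or its eventual fix.

```python
def is_power_of_two(n):
    """Return True if n is a positive power of two (hypothetical helper)."""
    return n > 0 and (n & (n - 1)) == 0


def fp16_size_ok(kind, in_size, out_size):
    """Pick which last-axis length to validate, depending on the transform kind.

    kind is 'C2C', 'R2C' or 'C2R' (illustrative labels, not CuPy constants);
    in_size and out_size are the element counts along the transformed axis.
    """
    if kind in ('C2C', 'R2C'):
        logical_size = in_size    # the input carries the full transform length
    elif kind == 'C2R':
        logical_size = out_size   # the real output carries the full length
    else:
        raise ValueError('unknown transform kind: {!r}'.format(kind))
    return is_power_of_two(logical_size)


n = 65536
print(fp16_size_ok('C2C', n, n))
print(fp16_size_ok('R2C', n, n // 2 + 1))
print(fp16_size_ok('C2R', n // 2 + 1, n))
```

Under these assumptions all three calls accept a 65536-point transform, including the R2C case whose complex output length of 32769 is what tripped the check in the traceback.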
