Discussion for possible enhancements of the new CUB support

Thanks to the great effort by @anaruse in #2090, I’ve seen encouraging performance boosts. Below is a list of possible improvements I can think of, aimed either at broadening support or at enabling further speedups. I’d be interested to know what I’ve missed or misunderstood.

  • Support complex numbers (#2538)
  • Allow using CUDA streams (#2555):

If users set up a context manager like this:

with stream:
    arr.sum()
    # do other stuff

The non-default stream should be honored. All of the CUB functions introduced in #2090 support an optional stream argument. We just need to pick up the current stream pointer during setup and modify the wrappers.
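As a rough illustration of "pick up the current stream pointer and pass it to the wrapper", here is a pure-Python model that runs without a GPU. The `Stream` class, `current_stream_ptr`, and `cub_sum` are hypothetical stand-ins, not CuPy's actual wrappers. In CuPy itself the current stream is available through `cupy.cuda.get_current_stream()`, and its `ptr` attribute holds the raw handle.

```python
import threading

# Per-thread record of the "current stream" (hypothetical model, not CuPy's code).
_state = threading.local()
DEFAULT_STREAM_PTR = 0  # stands for the default (null) stream


class Stream:
    """Hypothetical stand-in for a CUDA stream; `ptr` mimics the raw handle."""

    _next_ptr = 1

    def __init__(self):
        self.ptr = Stream._next_ptr
        Stream._next_ptr += 1

    def __enter__(self):
        # Remember the previous stream so it can be restored on exit.
        self._prev = getattr(_state, "ptr", DEFAULT_STREAM_PTR)
        _state.ptr = self.ptr
        return self

    def __exit__(self, *exc):
        _state.ptr = self._prev
        return False


def current_stream_ptr():
    """Return the pointer of the stream that is current in this thread."""
    return getattr(_state, "ptr", DEFAULT_STREAM_PTR)


def cub_sum(values, stream_ptr=None):
    """Hypothetical wrapper: returns the sum and the stream it would launch on."""
    if stream_ptr is None:
        # Honor the stream set by the enclosing `with stream:` block.
        stream_ptr = current_stream_ptr()
    return sum(values), stream_ptr
```

With this pattern, a call made inside `with stream:` reports that stream's pointer, and a call made outside falls back to the default stream.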

Currently, the CUB wrappers are all Python def functions. Converting them to cdef functions could be beneficial for performance. In particular, if we don’t want to expose those wrappers to end users, cdef would be a nice choice.

  • Support batch reduction for contiguous arrays (#2562):

Currently only a full reduction is supported. However, if a reduction over the last axes of a contiguous array of shape, say, (X, Y, Z) is needed, it seems possible with a naive loop over the remaining axes. In other words, in this case we can use CUB to do arr.sum(axis=2) or arr.sum(axis=(1, 2)), assuming arr is C-contiguous. This resembles the current treatment of PlanNd in the FFT module.
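The naive loop can be sketched in pure Python, treating a flat list as the C-contiguous buffer. The names `full_sum` and `sum_last_axes` are hypothetical, and `full_sum` merely stands in for one full CUB reduction over a contiguous segment. This is a model of the idea, not the approach taken in #2562.

```python
from math import prod


def full_sum(buf, start, length):
    """Stand-in for one full reduction over buf[start:start + length]."""
    return sum(buf[start:start + length])


def sum_last_axes(buf, shape, n_reduce):
    """Sum over the last `n_reduce` axes of a C-contiguous array.

    `buf` is the flat buffer and `shape` its logical shape. Because the
    layout is C-contiguous, every output element corresponds to one
    contiguous segment of the buffer, so a full reduction per segment suffices.
    Returns the flat result and its shape.
    """
    if not 0 < n_reduce <= len(shape):
        raise ValueError("n_reduce must be between 1 and len(shape)")
    split = len(shape) - n_reduce
    kept_shape = tuple(shape[:split])
    seg_len = prod(shape[split:])   # elements reduced per output value
    n_seg = prod(kept_shape)        # number of output values (1 if none kept)
    out = [full_sum(buf, i * seg_len, seg_len) for i in range(n_seg)]
    return out, kept_shape
```

For shape (X, Y, Z), `n_reduce=1` corresponds to arr.sum(axis=2) and `n_reduce=2` to arr.sum(axis=(1, 2)).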

  • Document how to enable CUB support when building from source: CUB_PATH and CUB_DISABLED need to be set. This could be avoided if the CUB source code were bundled (#2584)
  • Support argmin and argmax (#2596 enables a global (no axis) search)
  • Support half-precision floating points (#2600)
  • Support F-contiguous arrays (#2682)
  • Support sparse matrix operation (#2698)
  • Honor the keepdims argument (#2725)
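For the keepdims item, the result of a reduction only needs to be given the right shape afterwards. The helper below is a hypothetical sketch (`reduced_shape` is not a CuPy function) that computes the output shape following NumPy's convention for `axis` and `keepdims`.

```python
def reduced_shape(shape, axis=None, keepdims=False):
    """Shape of a reduction result under NumPy-style axis/keepdims rules."""
    ndim = len(shape)
    if axis is None:
        axes = set(range(ndim))          # reduce over every axis
    else:
        if isinstance(axis, int):
            axis = (axis,)
        for a in axis:
            if not -ndim <= a < ndim:
                raise ValueError(f"axis {a} is out of bounds for ndim {ndim}")
        axes = {a % ndim for a in axis}  # normalize negative axes
    if keepdims:
        # Reduced axes stay in place with length 1.
        return tuple(1 if i in axes else n for i, n in enumerate(shape))
    # Reduced axes are dropped.
    return tuple(n for i, n in enumerate(shape) if i not in axes)
```

A full reduction with keepdims=True on a (2, 3, 4) array would therefore be reshaped to (1, 1, 1).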

Question: (from https://github.com/cupy/cupy/pull/2508#issuecomment-536368493): is Jenkins configured to test CUB functionalities? UPDATE: No, see https://github.com/cupy/cupy/pull/2538#issuecomment-543507886.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 26 (26 by maintainers)

Top GitHub Comments

2 reactions
leofang commented, Mar 10, 2020

I am closing this meta-issue, as most of the listed tasks are completed except for the source-tree bundling (#2584) and the tests (#2598). Thanks to everyone for making all these improvements! It’s been a not-so-short, perhaps a bit bumpy journey since @anaruse added the initial support. 🙂

1 reaction
grlee77 commented, Nov 29, 2019

As documented in https://github.com/cupy/cupy/pull/2725#issuecomment-559879307, we could also use CUB to accelerate mean (which is also used internally by std and var).
