
RFC: Support performant `einsum()` and `einsum_path()` using cuQuantum

See original GitHub issue

Description

Related to #6078.

The new cuQuantum SDK provides two libraries, one of which is cuTensorNet, which accelerates tensor network contraction (backed by cuTENSOR). Specifically, cuTensorNet currently addresses two major challenges that arise in problems with a huge einsum expression:

  1. Finding the optimal contraction path
  2. Executing the actual pairwise contraction for a given path in an optimal fashion
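These two steps mirror NumPy's CPU-side API, so a quick NumPy sketch (shapes chosen arbitrarily) illustrates what each one does:

```python
import numpy as np

# Chained matrix products as a single einsum expression. For skewed
# shapes, the pairwise contraction order chosen by the path finder can
# save a large fraction of the FLOPs of a naive evaluation.
a = np.random.rand(8, 128)
b = np.random.rand(128, 128)
c = np.random.rand(128, 8)

# Step 1: find a contraction path.
path, info = np.einsum_path('ij,jk,kl->il', a, b, c, optimize='optimal')

# Step 2: execute the pairwise contractions along that path.
result = np.einsum('ij,jk,kl->il', a, b, c, optimize=path)
```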

Those two functionalities map very nicely to NumPy’s einsum_path() and einsum() APIs, respectively. As a result, in cuQuantum Python, the Python binding for the cuQuantum libraries, we provide the pythonic APIs cuquantum.einsum_path() and cuquantum.einsum(), among other things. We currently provide two conda packages, cuquantum and cuquantum-python, and a pip wheel release is on our near-term roadmap.
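As a rough illustration of the drop-in idea (a sketch, not CuPy's actual dispatch): since the cuQuantum functions mirror the NumPy signatures, user code could opportunistically pick up the accelerated version:

```python
import numpy as np

# Sketch: prefer the cuTensorNet-backed einsum when cuQuantum Python is
# installed; otherwise fall back to NumPy. Caveat: the cuQuantum
# version covers only a subset of the NumPy options.
try:
    from cuquantum import einsum  # GPU-accelerated via cuTensorNet
except ImportError:
    from numpy import einsum      # CPU fallback

a = np.random.rand(16, 32)
b = np.random.rand(32, 16)
out = einsum('ij,jk->ik', a, b)
```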

I would like to propose using cuQuantum Python to back CuPy’s einsum_path() and einsum() when available. Performance improvement data can be shared if needed. However, there are a few issues to be addressed:

  • Because CuPy is currently a required dependency of cuQuantum Python, we need to break any potential circular import.
  • For the initial release, cuquantum.einsum_path() and cuquantum.einsum() are not yet a 100% drop-in replacement of their NumPy counterparts (we mark most optional arguments as unsupported).
  • Directly using our drop-in replacement APIs incurs a small performance penalty (because the library handle is not reused).
  • We currently only support classical einsum; ellipsis, broadcasting, etc. are not yet supported.

Below is my 4-step proposal, which involves a small refactoring of the existing codebase:

  1. Add cupy.einsum_path() as a fallback: the functionality already exists in cupy/linalg/_einsum.py; we just need to put everything together.
  2. Introduce ACCELERATOR_CUQUANTUM as an optional routine-accelerator backend: this way, users can set CUPY_ACCELERATORS=cub,cutensor,cuquantum. This is desirable so that things can be tested separately (ex: see here).
  3. Use the core functionalities of cuTensorNet to back cupy.einsum_path() and cupy.einsum(): for the reasons above, I’d rather not use cuquantum.einsum_path() and cuquantum.einsum() directly, so most likely we’ll need to add some helper functions to check:
    • If ACCELERATOR_CUQUANTUM is requested
    • If cuquantum can be imported
    • If all user inputs can be handled by cuTensorNet
    • If so, forward to cuQuantum Python, otherwise fall back
  4. Add some mock tests to ensure the backend can be used when available
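Putting step 3's checks together, the dispatch logic might look roughly like this (all names are hypothetical, and NumPy stands in for the CuPy internals):

```python
import os
import numpy as np

def _import_cuquantum():
    # Import lazily, to sidestep the circular-import problem noted
    # above (cuQuantum Python itself depends on CuPy).
    try:
        import cuquantum
        return cuquantum
    except ImportError:
        return None

def einsum_with_fallback(subscripts, *operands):
    """Hypothetical dispatcher: forward to cuQuantum only when the
    accelerator is requested, the package imports, and the expression
    is in the supported (classical) subset; otherwise fall back."""
    requested = 'cuquantum' in os.environ.get('CUPY_ACCELERATORS', '').split(',')
    cq = _import_cuquantum() if requested else None
    if cq is not None and '...' not in subscripts:
        return cq.einsum(subscripts, *operands)
    return np.einsum(subscripts, *operands)  # existing implementation

a = np.random.rand(4, 5)
b = np.random.rand(5, 6)
out = einsum_with_fallback('ij,jk->ik', a, b)
```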

Additional Information

No response

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 2
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

1 reaction
rgommers commented, Mar 23, 2022

Thanks for the detailed explanation @leofang! I wasn’t aware of this symlink issue - that seems like a pretty serious design problem with wheels indeed.

> …and there’s a WIP community effort (https://github.com/karellen/wheel-axle-runtime and https://github.com/karellen/wheel-axle) to the rescue; some hacks have to be done one way or another. I see no other way out…

Ingenious, but it doesn’t look like something anyone should be using for production-quality releases right now.

1 reaction
leofang commented, Mar 23, 2022

For today, yes, but wheels should be available soon… 😅

@kmaehashi Well, we hit an unexpected delay in the wheel release, so we decided to skip cuQuantum v0.1.0 + cuTENSOR 1.4.0 and go directly with cuQuantum v1.0.0 + cuTENSOR 1.5.0. The cuTENSOR wheel is up: https://pypi.org/project/cutensor/

(btw, given the lack of symlink support in wheels it’s a bit awkward to build a project against the cuTENSOR wheel (do pip install yourself to see what I mean), so I am not sure if it’d be useful for CuPy… Also notice the wheel size…)


