uarray based backend compatibility tracker

See original GitHub issue

Problem

SciPy adopted uarray as a multi-dispatch mechanism, with the goal that alternative implementations (OpenMP, GPU kernels, etc.) do not need to live in the SciPy codebase. See the motivation section below for a more concrete discussion.

SciPy currently supports this through the scipy.fft module.
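
As a rough illustration of what the existing scipy.fft support looks like from the user side, the sketch below installs a backend with scipy.fft.set_backend. CountingBackend is a hypothetical name made up for this example. The `__ua_domain__` / `__ua_function__` protocol and the "numpy.scipy.fft" domain string are taken from the scipy.fft documentation; the backend itself does no real work and only records which functions were dispatched, so treat it as a sketch and not as a reference implementation.

```python
import numpy as np
import scipy.fft


class CountingBackend:
    """Hypothetical backend: records dispatched calls, then declines them."""

    # Domain under which scipy.fft registers its multimethods.
    __ua_domain__ = "numpy.scipy.fft"

    def __init__(self):
        self.calls = []

    def __ua_function__(self, method, args, kwargs):
        # `method` is the scipy.fft multimethod being called.
        self.calls.append(method.__name__)
        # NotImplemented asks uarray to fall through to the next backend,
        # which here is SciPy's built-in implementation.
        return NotImplemented


backend = CountingBackend()
x = np.arange(8.0)

# The backend is only active inside the `with` block.
with scipy.fft.set_backend(backend):
    y = scipy.fft.fft(x)

print(backend.calls)
```

A real backend would return a computed result from `__ua_function__` for the functions it supports, and NotImplemented for the rest.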

Other SciPy modules would also benefit from a uarray backend, and the usage could later be extended through libraries like CuPy (cupyx.scipy) and Dask (dask.array).

Proposed Modules

  • scipy.ndimage (#14356). Note: cupyx.scipy.ndimage implements almost all functions (the exceptions are geometric_transform and watershed_ift), while dask-image is currently less complete. dask-image also has a different namespace structure at present, but dask/dask-image#198 plans to address this.

    • Filters
    • Fourier filters
    • Interpolation
    • Measurements
    • Morphology
  • scipy.linalg (#14407). TODO: add a more comprehensive note on cross-library availability of functions. For now, a quick look suggests that not all functions are available in CuPy or Dask.

    • Basics
    • Eigenvalue Problems
    • Decompositions
    • Matrix Functions
    • Matrix Equation Solvers
    • Sketches and Random Projections
    • Special Matrices
    • Low-level routines
  • scipy.special. Note: these are element-wise functions, so they can be made to work with Dask fairly easily later on. CuPy already has some of the functions.

Once SciPy support is added, these libraries should be updated to make use of uarray, similar to what was done here.


Motivation for uarray

See gh-10204 comment

The protocol not covering things like array creation functions is one thing, but there’s a more important limitation I think: it is specific to “types of arrays”. So if you want to create functions with the same API for GPU arrays (CuPy, PyTorch), distributed arrays (Dask), sparse arrays (scipy.sparse, pydata/sparse), then it works. But what if you want to provide an alternative implementation for ndarrays? You simply cannot do that. Pyfftw, mkl-fft and pypocketfft all work on regular numpy arrays. So letting the numpy array carry around information about what implementation to use is just fundamentally not going to work. Instead, it’s the library that must be able to say “hey, here’s an implementation (perhaps for specific types)”, and a mechanism for either automatic or user-controlled selection of which implementation/backend to use.
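
The point made in this comment (the library, not the array type, selects the implementation) can be sketched with a toy dispatcher. This is not uarray's API: set_backend, multimethod, total and LoggingBackend are all hypothetical names invented for the illustration, and the mechanism is deliberately simplified. What it shows is that the same plain Python list is handled by two different implementations depending on which backend the user has activated.

```python
from contextlib import contextmanager

# Stack of active backends; the most recently set one is tried first.
_backends = []


@contextmanager
def set_backend(backend):
    """Activate `backend` for the duration of a `with` block."""
    _backends.append(backend)
    try:
        yield
    finally:
        _backends.pop()


def multimethod(default):
    """Wrap `default` so that active backends may override it by name."""

    def dispatcher(*args, **kwargs):
        for backend in reversed(_backends):
            impl = getattr(backend, default.__name__, None)
            if impl is not None:
                result = impl(*args, **kwargs)
                if result is not NotImplemented:
                    return result
        # No backend handled the call: use the library's own implementation.
        return default(*args, **kwargs)

    dispatcher.__name__ = default.__name__
    return dispatcher


@multimethod
def total(values):
    """Default implementation shipped by the 'library'."""
    return sum(values)


class LoggingBackend:
    """Alternative implementation for the same input type (a plain list)."""

    def __init__(self):
        self.seen = []

    def total(self, values):
        self.seen.append(list(values))
        return float(sum(values))


data = [1, 2, 3]
default_result = total(data)      # handled by the default implementation

backend = LoggingBackend()
with set_backend(backend):
    swapped_result = total(data)  # same input type, different implementation
```

Nothing about `data` changes between the two calls; only the active backend does, which is exactly what a protocol attached to the array type cannot express.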

See gh-13965 comment

For example, a CUDA-based tensor object from a deep learning framework could invoke CuFFT. I think (not 100% certain) that this also allows you to slot in your own preferred FFT library as a backend even for plain-old numpy ndarray objects. We used to have multiple FFT backends selected at build time, but it was difficult to add new ones, and not easy to support incompatibly-licensed FFT libraries like the popular FFTW. I think this new multidispatch mechanism allows that to be slotted in at runtime.

See gh-13965 comment

It’s possible it will extend to scipy.linalg, as it also has some need to swap out backends like that, but it probably won’t be a widely used pattern across all of scipy.

cc @rgommers @peterbell10 @grlee77 @IvanYashchuk

Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions: 2
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

2 reactions
AnirudhDagar commented, Jul 9, 2021

Sure, that would be great @czgdp1807! Please confirm the same with @peterbell10 or @rgommers before you start.

1 reaction
rgommers commented, Feb 23, 2022

I see a lot of open PRs to use uarray in modules (both in scipy and cupy). Could you explain what the current plan is for moving forward?

  • Improving the uarray docs so it’s clearer what each uarray feature is for and what the cost of not using it is, so it’s easier to decide on those things,
  • Extract code examples for dispatching on specific functions from PRs (e.g., dev docs examples near the top of the gh-14356 diff),
  • Trying to make the various threads across Discourse, SciPy, scikit-image and scikit-learn converge,
  • Merge support once we’re collectively happy.

The best way to find real-world issues, as well as to judge things like code complexity, is to write code. Modules like ndimage and linalg are significantly more complex than fft, so they turn up cases of interest that we need to take into account. Backends are not that difficult to write once you wrap your head around them, so having working backends for the most important SciPy modules is very useful. That is why we defined internships that included this, first for @AnirudhDagar and now for @Smit-create.
