How to expose API to downstream libraries?

I wanted to open a discussion on how the Array API (and potentially the dataframe API) will be exposed to downstream libraries.

For example, let’s say I am the author of scikit-learn. How do I get access to an “Array compatible API”? Or let’s say I am a downstream user, using scikit-learn in a notebook. How can I tell it to use TensorFlow instead of NumPy?

Options

I present three options here, but I would appreciate any suggestions on further ideas:

Manual

The default option is the status quo, where there is no standard way to get access to a backend that conforms to the array API.

Different downstream libraries, like scikit-learn, could introduce their own mechanisms, like a backend kwarg to functions, if they wanted to support different backends.
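As a rough sketch of what such a library-specific mechanism might look like, the function below accepts a backend argument. The names center, list_backend and tuple_backend are hypothetical and exist only for this illustration; the toy namespaces stand in for real array libraries such as NumPy or CuPy.

```python
import types

def _mean(x):
    return sum(x) / len(x)

# Toy stand-ins for array libraries (assumption: a real backend would be a
# module such as numpy exposing equivalent functions).
list_backend = types.SimpleNamespace(
    asarray=list,
    mean=_mean,
    subtract=lambda x, c: [v - c for v in x],
)
tuple_backend = types.SimpleNamespace(
    asarray=tuple,
    mean=_mean,
    subtract=lambda x, c: tuple(v - c for v in x),
)

def center(values, backend=list_backend):
    """Subtract the mean from each element using the chosen backend."""
    xp = backend
    x = xp.asarray(values)
    return xp.subtract(x, xp.mean(x))
```

Each library would have to invent and document its own keyword like this, which is the drawback of the manual option.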

Local Dispatch

Another approach would be to provide access to the related module from particular instances of the objects, which is the one taken by NEP 37.

In this case, scikit-learn would either call some x.__array_module__() method on its inputs, or we would provide an array-api Python package with a helper function like get_array_module(x), similar to the NEP.

There is an open PR in scikit-learn (https://github.com/scikit-learn/scikit-learn/pull/16574) to add support for NEP 37.
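A simplified sketch of local dispatch, loosely modelled on the NEP 37 protocol, could look like the following. ToyArray, toy_namespace and total are hypothetical names made up for this example, and the helper here is a reduced stand-in, not the NEP's reference implementation.

```python
import types

# Toy namespace standing in for an array library module (assumption).
toy_namespace = types.SimpleNamespace(
    name="toy",
    sum=lambda arr: sum(arr.data),
)

class ToyArray:
    """Hypothetical array type that advertises its own namespace."""

    def __init__(self, data):
        self.data = list(data)

    def __array_module__(self, types_):
        # Only claim the call if every array type involved is ours.
        if all(issubclass(t, ToyArray) for t in types_):
            return toy_namespace
        return NotImplemented

def get_array_module(*arrays, default=None):
    """Ask each input for its module; fall back to `default` if none answers."""
    array_types = tuple({type(a) for a in arrays})
    for array in arrays:
        hook = getattr(type(array), "__array_module__", None)
        if hook is None:
            continue
        module = hook(array, array_types)
        if module is not NotImplemented:
            return module
    if default is None:
        raise TypeError("no array module found for the given inputs")
    return default

def total(x):
    # Downstream code never names a concrete library.
    xp = get_array_module(x)
    return xp.sum(x)
```

With this shape, the backend travels with the data: whatever array type the user passes in decides which namespace the downstream function uses.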

Global Dispatch

Instead of requiring an object to inspect, we could rely on a global context to store the “active array API” and provide ways of getting and setting it. Some form of this is implemented by SciPy, with scipy.fft.set_backend, which uses uarray.

That is probably heavier weight than we need, but it does illustrate the general concept. I think if we implemented this, we could use context variables like Python’s built-in decimal module does, i.e. something like this:

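from array_api import set_backend, get_backend  # proposed package, see above

import cupy

def some_fn():
    # Look up whichever backend is active in the current context.
    np = get_backend()
    return np.arange(10)

# Make cupy the active backend for everything called inside the block.
with set_backend(cupy):
    some_fn()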
The advantage of using global dispatch is that you don’t need to rely on passing in some custom instance class to set the backend.
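For illustration, the get/set pair could be sketched with the standard library’s contextvars module, in the same spirit as the decimal module’s context handling. The two toy backends and the variable names are assumptions for this example; a real implementation would store array library modules instead.

```python
import contextlib
import contextvars
import types

# Toy stand-ins for array libraries (assumption, for illustration only).
default_backend = types.SimpleNamespace(
    name="default", arange=lambda n: list(range(n))
)
other_backend = types.SimpleNamespace(
    name="other", arange=lambda n: tuple(range(n))
)

# The "active array API" lives in a context variable, so threads and
# asyncio tasks each see their own value.
_active_backend = contextvars.ContextVar(
    "active_backend", default=default_backend
)

def get_backend():
    """Return the backend active in the current context."""
    return _active_backend.get()

@contextlib.contextmanager
def set_backend(backend):
    """Temporarily make `backend` the active one."""
    token = _active_backend.set(backend)
    try:
        yield backend
    finally:
        # Restore whatever was active before, even if the body raised.
        _active_backend.reset(token)

def some_fn():
    xp = get_backend()
    return xp.arange(3)
```

Inside a `with set_backend(other_backend):` block, some_fn() uses the other backend; once the block exits, the previous backend is active again.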

Static Typing

This is slightly tangential, but one question that comes up for me is how we could properly statically type options 2 or 3 (local or global dispatch). It seems like what we need is a typing.Protocol, but for modules. I raised this as a discussion point on the typing-sig mailing list.
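To make the idea concrete, here is a minimal sketch of a protocol describing an array namespace. ArrayNamespace, first_n and toy_backend are hypothetical names, and the protocol lists a single function only to keep the example short. PEP 544 does describe modules as possible implementations of protocols, but how well a particular type checker handles that is not something this sketch verifies; the runtime check below only confirms that the attribute exists.

```python
import types
from typing import Protocol, Sequence, runtime_checkable

@runtime_checkable
class ArrayNamespace(Protocol):
    """Hypothetical description of what a backend module must provide."""

    def arange(self, stop: int) -> Sequence[int]: ...

def first_n(xp: ArrayNamespace, n: int) -> Sequence[int]:
    # Annotated against the protocol rather than a concrete library.
    return xp.arange(n)

# Toy module built at runtime (assumption: stands in for a real array library).
toy_backend = types.ModuleType("toy_backend")
toy_backend.arange = lambda stop: list(range(stop))

# Structural runtime check: only tests that `arange` is present.
conforms = isinstance(toy_backend, ArrayNamespace)
```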

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 13 (12 by maintainers)

Top GitHub Comments

2 reactions
rgommers commented, Jan 26, 2021

This is indeed appealing in its simplicity, and would suffice for many use cases, i.e., code that only uses one array type.

The answer we arrived at here is that even if there are multiple array types involved, those should come from the same library, in which case there is no dispatch problem.

I think we can close this?

2 reactions
shoyer commented, Aug 19, 2020

One other alternative to __array_module__ would be a similar method that takes no arguments and just operates on the array object itself, i.e. x.__array_module__() instead of x.__array_module__((type(x),)). This seems like it would be a bit more straightforward for libraries to implement?

This is indeed appealing in its simplicity, and would suffice for many use cases, i.e., code that only uses one array type.

It doesn’t solve the bigger “multiple library dispatch” problem, but for many projects that isn’t so important. Multi-library dispatch could perhaps be added separately with another protocol that determines which array takes priority, and which perhaps could get reused for Python binary arithmetic.
