Change API to connect to a backend

Currently, the API to connect to a backend, with a backend-specific option, is as follows:

import ibis

ibis.options.impala.temp_db = 'foo'

conn = ibis.impala.connect(host='impala',
                           database='ibis_testing',
                           hdfs_client=ibis.hdfs_connect(host='impala', port=50070))

I think it has a few drawbacks, in particular:

  • It’s not very intuitive: ibis.impala looks like an impala module inside ibis, but it was originally ibis/impala/api.py, is now ibis/backends/impala, and could potentially be a module in any other package
  • The code becomes trickier, since we need to use __getattr__ and load the backends dynamically, while ideally keeping the attributes of ibis introspectable, including backends
  • The ibis namespace is already huge, and mixing in special attributes for backends makes things more complex
  • When a backend is misspelled or doesn’t exist, the error message is not very intuitive, since we can’t know whether the user was trying to access a backend or a regular attribute

An alternative API that I personally find easier is:

import ibis

conn = ibis.connect(engine='impala',
                    host='impala',
                    database='ibis_testing',
                    hdfs_client=ibis.hdfs_connect(host='impala', port=50070),
                    temp_db='foo')

I think this API reduces the magic significantly. There is a single function that, once called, looks for the backend and connects to it. If the backend doesn’t exist, we can provide a clear error message.

The only drawback I see is that the signature of connect won’t be very specific: ibis.connect(engine, **kwargs). The backend connect function will have its parameters clearly documented, so I don’t think it makes a big difference for the documentation. But for introspection we would only have **kwargs.
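To make the proposal concrete, here is a minimal sketch of how a connect(engine, **kwargs) dispatcher could look up a backend and fail with a clear message. This is not Ibis code: the _BACKENDS registry, register_backend, and the fake Impala connect function are all hypothetical names used only for illustration, and real backend discovery would likely work differently.

```python
import difflib

# Hypothetical registry mapping engine names to connect callables.
# It only stands in for whatever backend discovery Ibis would use.
_BACKENDS = {}


def register_backend(name, connect_func):
    """Register a backend's connect function under an engine name."""
    _BACKENDS[name] = connect_func


def connect(engine, **kwargs):
    """Look up the backend by name and forward all other arguments to it."""
    try:
        backend_connect = _BACKENDS[engine]
    except KeyError:
        # We know the user wanted a backend, so the error can say so
        # and suggest the closest registered name, if there is one.
        close = difflib.get_close_matches(engine, list(_BACKENDS), n=1)
        hint = f" Did you mean {close[0]!r}?" if close else ""
        raise ValueError(
            f"Unknown backend {engine!r}.{hint} "
            f"Available backends: {sorted(_BACKENDS)}"
        ) from None
    return backend_connect(**kwargs)


def _fake_impala_connect(host, database, temp_db=None):
    """Stand-in for a real backend; returns the parameters it received."""
    return {"engine": "impala", "host": host,
            "database": database, "temp_db": temp_db}


register_backend("impala", _fake_impala_connect)

conn = connect("impala", host="impala", database="ibis_testing", temp_db="foo")
```

Calling connect("impla", host="impala") in this sketch raises a ValueError that names the misspelled engine and suggests 'impala', which is the kind of message attribute access on ibis cannot easily give.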

Something that could make things simpler is to use a connection string instead of **kwargs:

conn = ibis.connect(conn_str='impala://user@impala/ibis_testing',
                    hdfs_client=ibis.hdfs_connect(host='impala', port=50070),
                    temp_db='foo')

This should work directly with SQLAlchemy backends and OmniSciDB. I guess some backends may need to parse the URL in the Ibis backend, but that doesn’t seem like a big deal.
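For backends that would need to parse the URL themselves, the standard library already does most of the work. The sketch below assumes the connection string follows the usual scheme://user@host:port/database layout; the parse_conn_str helper and the keys of the returned dict are hypothetical, not an existing Ibis API.

```python
from urllib.parse import urlsplit


def parse_conn_str(conn_str):
    """Split a connection string into the pieces a backend would need."""
    parts = urlsplit(conn_str)
    if not parts.scheme:
        raise ValueError(f"Connection string has no engine: {conn_str!r}")
    return {
        "engine": parts.scheme,      # e.g. 'impala'
        "user": parts.username,      # None when not given
        "host": parts.hostname,
        "port": parts.port,          # None when not given
        "database": parts.path.lstrip("/") or None,
    }


params = parse_conn_str("impala://user@impala/ibis_testing")
```

The engine name comes from the URL scheme, so a dispatcher could use it to pick the backend and pass the remaining fields on as keyword arguments.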

Somewhat unrelated: I’d move backend-specific options to the connection object, instead of having them as options. I think this will make things easier and clearer, both for Ibis maintainers and for users.

I’d add the ibis.connect method for 2.0, and a FutureWarning for ibis.<backend>, and remove the latter in Ibis 3.0.
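One possible way to emit that FutureWarning is a module-level __getattr__ (PEP 562). The sketch below builds a throwaway module named fake_ibis to stay self-contained; the backend name set and the returned placeholder string are assumptions for illustration, not how Ibis implements it.

```python
import types
import warnings

# Hypothetical set of backend names that used to be reachable as attributes.
_BACKEND_NAMES = {"impala", "bigquery"}


def _make_fake_ibis():
    """Build a stand-in module whose backend attributes warn on access."""
    mod = types.ModuleType("fake_ibis")

    def __getattr__(name):
        # Only called when normal attribute lookup on the module fails.
        if name in _BACKEND_NAMES:
            warnings.warn(
                f"fake_ibis.{name} is deprecated; "
                f"use fake_ibis.connect(engine={name!r}, ...) instead",
                FutureWarning,
                stacklevel=2,
            )
            return f"<backend {name}>"  # placeholder for the real backend
        raise AttributeError(f"module 'fake_ibis' has no attribute {name!r}")

    mod.__getattr__ = __getattr__
    return mod


fake_ibis = _make_fake_ibis()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    backend = fake_ibis.impala  # still works, but records a FutureWarning
```

FutureWarning is shown to end users by default, unlike DeprecationWarning, which fits a change that affects everyone calling ibis.<backend> directly.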

@jreback happy with this?

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 14 (14 by maintainers)

Top GitHub Comments

1 reaction
datapythonista commented, Apr 23, 2021

Agree on that. But technically we can also evaluate lazily in __getattr__ of ibis, and load the backends on access to ibis.bigquery or equivalent. The reason we’re not doing that is options. The following would fail, since the option doesn’t exist if the backend hasn’t been loaded. That’s the main reason I don’t think we should have backend options.

ibis.options.bigquery.partition_col = 'foo'

ibis.bigquery.connect(connection_string)

On the circular imports, I fail to see how having the backend in a separate repo affects it. Everything should be the same: backends are loaded as entry points even when they live in this repo. So nothing should have changed, I’d say. Maybe I’m missing something, but it seems to me that the cause must be something else.

0 reactions
cpcloud commented, Dec 16, 2021

Connection string logic seems like a good approach here. There’s a lot of discussion on this issue, which I appreciate, but it’s better to work out the details for something this complex against real code so folks can try it out and collaborate.
