impala connection via sqlalchemy


I’m new to Hadoop and Impala. After many days, I’ve managed to connect using:

from impala.dbapi import connect
from impala.util import as_pandas

conn = connect(host="server.lrd..com", port=21050, database='tcad',
               auth_mechanism='PLAIN', user="alexcj", use_ssl=True,
               timeout=20, password="secret1pass")

cursor = conn.cursor()
cursor.execute('SELECT * FROM bom_2014_m LIMIT 10')
df = as_pandas(cursor)

Basically, I’m finally able to connect, query and create a dataframe from returned results.

Now, how do I connect to Impala with SQLAlchemy? I’ve seen

engine = create_engine('impala://localhost')

in the test files, but my connection needs a whole lot more parameters than just the host. How do I pass all my parameters above to create_engine? I’m designing a backend API with Flask that will query Impala, so this step is important. If you’re familiar with flask-sqlalchemy, I’d like to know exactly what to pass to SQLALCHEMY_DATABASE_URI. If not, I’m willing to use SQLAlchemy’s declarative extension, hence this question. Thanks.

Issue Analytics

  • State: open
  • Created 7 years ago
  • Comments: 7

Top GitHub Comments

7 reactions
pyite1 commented, Sep 28, 2016
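import sqlalchemy
from impala.dbapi import connect

# Placeholder credentials; substitute your own
user = 'some_user'
pwd = 'some_password'

def conn():
    # Open the DBAPI connection directly so every connect() option can be set
    return connect(host='some_host',
                   port=21050,
                   database='default',
                   timeout=20,
                   use_ssl=True,
                   ca_cert='some_pem',
                   user=user, password=pwd,
                   auth_mechanism='PLAIN')

# With creator= supplied, the URL only needs to name the dialect
engine = sqlalchemy.create_engine('impala://', creator=conn)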

You may or may not need some of those parameters.
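
For the flask-sqlalchemy part of the question, one possible approach is to hand the same kind of creator function to the engine through the SQLALCHEMY_ENGINE_OPTIONS setting. This is only a sketch: it assumes Flask-SQLAlchemy 2.4 or later, where that setting is forwarded to create_engine, and the names make_impala_creator and FLASK_CONFIG, along with every connection value, are hypothetical placeholders that do not come from the thread.

```python
# Sketch only: host, user, password, and the helper names are placeholders.

def make_impala_creator(**connect_kwargs):
    """Return a zero-argument function that opens an impyla connection."""
    def creator():
        # Imported lazily so this module still loads where impyla is absent.
        from impala.dbapi import connect
        return connect(**connect_kwargs)
    return creator


impala_creator = make_impala_creator(
    host="some_host",
    port=21050,
    database="default",
    auth_mechanism="PLAIN",
    user="some_user",
    password="some_password",
    use_ssl=True,
    timeout=20,
)

# Settings intended for app.config.update(FLASK_CONFIG), applied before
# the SQLAlchemy(app) extension is initialised.
FLASK_CONFIG = {
    # The URI only names the dialect; the creator supplies the real options.
    "SQLALCHEMY_DATABASE_URI": "impala://",
    "SQLALCHEMY_ENGINE_OPTIONS": {"creator": impala_creator},
}
```

Nothing connects until the engine first asks for a connection, so a wrong host or password would only surface on the first query.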

1 reaction
jrburris commented, May 25, 2017

Does anyone know what is needed to connect to impala if we are using Kerberos?


Top Results From Across the Web

impala connection via sqlalchemy - python - Stack Overflow
I'd like to be able to use sqlalchemy to connect to impala and be able to use some nice sqlalchemy functions. I found a...

Use SQLAlchemy ORMs to Access Impala Data in Python
The CData Python Connector for Impala enables you to create Python applications and scripts that use SQLAlchemy Object-Relational Mappings of Impala data. The ...

Connect SQLalchemy to Cloudera Impala or Hive
Below code will connect to Impala with Kerberos enabled. You can also connect to Hive by changing host and port to 10000.

Configuring Impyla for Impala | CDP Public Cloud
Explains how to install Impyla to connect to and submit SQL queries to Impala. Impyla is a Python client wrapper around the HiveServer2...

Python and Impala — Quick Overview and Samples - SoftKraft
Now to the crux of our content. Impala is an MPP (Massive Parallel Processing) SQL query engine for processing huge volumes of data...
