MsSqlHook.get_sqlalchemy_engine uses pyodbc instead of pymssql


Apache Airflow Provider(s)

microsoft-mssql

Versions of Apache Airflow Providers

apache-airflow-providers-microsoft-mssql==2.0.1

Apache Airflow version

2.2.2

Operating System

Ubuntu 20.04

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

What happened

MsSqlHook.get_sqlalchemy_engine uses the default mssql driver: pyodbc instead of pymssql.

  • If pyodbc is installed: we get sqlalchemy.exc.InterfaceError: (pyodbc.InterfaceError)
  • Otherwise we get: ModuleNotFoundError

PS: Looking at the code, this should still apply up to provider version 3.0.0 (the latest version).

What you think should happen instead

The default driver used by sqlalchemy.create_engine for mssql is pyodbc.

To use pymssql with create_engine, the URI needs to start with mssql+pymssql:// (currently the hook uses DBApiHook.get_uri, which returns a URI starting with mssql://).
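
A minimal sketch of the URI change described above, using only the standard library. The helper name `with_pymssql_driver` is hypothetical and is not part of Airflow or SQLAlchemy; it only rewrites the scheme of a URI like the one DBApiHook.get_uri returns, and it is not necessarily how the provider ended up fixing this.

```python
from urllib.parse import urlsplit, urlunsplit


def with_pymssql_driver(uri: str) -> str:
    """Return `uri` with a bare `mssql` scheme rewritten to `mssql+pymssql`.

    Hypothetical helper for illustration only. URIs that already name a
    driver (for example `mssql+pyodbc://...`) are returned unchanged.
    """
    parts = urlsplit(uri)
    if parts.scheme == "mssql":
        # Only the scheme changes; credentials, host, port and path are kept.
        parts = parts._replace(scheme="mssql+pymssql")
    return urlunsplit(parts)


# The URI from the demo in this issue, with the driver made explicit.
print(with_pymssql_driver("mssql://test:pwd@test:1433"))
```

Passing the rewritten URI to sqlalchemy.create_engine should then select the pymssql dialect rather than the pyodbc default, assuming pymssql is installed.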

How to reproduce

>>> from contextlib import closing
>>> from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook
>>>
>>> hook = MsSqlHook()
>>> with closing(hook.get_sqlalchemy_engine().connect()) as c:
...     with closing(c.execute("SELECT SUSER_SNAME()")) as res:
...         r = res.fetchone()

This raises an exception because the wrong driver is used.

Anything else

Demo for sqlalchemy default mssql driver choice:

# pip install sqlalchemy
... Successfully installed sqlalchemy-1.4.39
# pip install pymssql
... Successfully installed pymssql-2.2.5
>>> from sqlalchemy import create_engine
>>> create_engine("mssql://test:pwd@test:1433")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 2, in create_engine
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/deprecations.py", line 309, in warned
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/create.py", line 560, in create_engine
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/connectors/pyodbc.py", line 43, in dbapi
    return __import__("pyodbc")
ModuleNotFoundError: No module named 'pyodbc'

Are you willing to submit PR?

  • Yes I am willing to submit a PR!


Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

potiuk commented, Jul 21, 2022 (1 reaction)

@potiuk I might have misinterpreted your question here

Yep. It is normal for people to create scenarios in their heads that are not in the heads of others. It was just a question: I was trying to find out what your motivation is and whether PyODBC is not enough. I can very well imagine that you could use the ODBC one. Maintainers care about maintaining the project, but when they see a change or proposal they often ask questions to find out what the motivations are and where things come from.

Just so you know: none of us knows everything about those two drivers by heart. Everything there is to know is in the code.

If you imagine that any of the people here know all the 3000+ classes and 75+ providers by heart, that is a wrong assumption. Much of the Airflow code has been contributed by people like you (we have > 2100 contributors), and there is not a single person who knows everything or has plans to deprecate or remove any providers (this is also the reason why we have unit tests: they ultimately check whether the contributed code still works).

If there are such plans, they are always public on the devlist, and the only way they can happen is by updating the code here and making notes in the release notes. There is no “secret organisation” that has some plans for deprecation here.

But there is nothing wrong with asking “why” questions and drawing conclusions from the answers (although jumping to the conclusions you did from just being asked a question is a bit premature 😃).

FanatoniQ commented, Jul 15, 2022 (1 reaction)

I started working on a fork. I’ll make a PR next week once it’s ready.

