OdbcHook converts string values in connect_kwargs dict to None

Apache Airflow version: 2.0.1

What happened:

OdbcHook returns None for non-boolean-like string values in connect_kwargs dict arg

What you expected to happen:

connect_kwargs values should remain as is.

How to reproduce it:

>>> from airflow.providers.odbc.hooks.odbc import OdbcHook
>>> OdbcHook('my_conn', connect_kwargs={'CurrentSchema': 'SCHEMA'}).connect_kwargs
{'CurrentSchema': None}

Anything else we need to know:

The issue lies in: https://github.com/apache/airflow/blob/db9febdb3be97832679d2ced8028fd7f1c21cd4e/airflow/providers/odbc/hooks/odbc.py#L170-L173

There’s no else branch that returns val for strings that are not boolean-like. As it is, any such string value returns None instead.
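A minimal sketch of the failure mode, assuming the helper is shaped roughly like this (a reconstruction for illustration, not the verbatim Airflow source; the names clean_bool_buggy and clean_bool_fixed are hypothetical). The `else` is attached to the outer `if`, so a string that is neither 'true' nor 'false' falls off the end of the function:

```python
def clean_bool_buggy(val):
    """Assumed shape of the faulty helper (reconstruction, not the real source)."""
    if hasattr(val, "lower"):
        if val.lower() == "true":
            return True
        elif val.lower() == "false":
            return False
        # Any other string falls through here: implicit `return None`.
    else:
        return val


def clean_bool_fixed(val):
    """Illustrative fix: always fall back to returning the original value."""
    if hasattr(val, "lower"):
        lowered = val.lower()
        if lowered == "true":
            return True
        if lowered == "false":
            return False
    return val


print(clean_bool_buggy("SCHEMA"))  # the string is lost
print(clean_bool_fixed("SCHEMA"))  # the string survives
```

The fixed variant is only there to show where the missing return belongs; it still rewrites boolean-like strings, which is the separate concern raised in the next paragraph.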

Pardon my ignorance on the subject, but this raises a question: Why is clean_bool being called in the first place for a user-provided dictionary? I’m not sure how this is necessary because the user can provide a literal boolean value in the dictionary if needed, no? In the event that a driver needs to take a case-sensitive boolean string for some parameter, clean_bool would make it impossible to provide such a value.

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:8 (5 by maintainers)

Top GitHub Comments

dstandish commented, May 10, 2021 (2 reactions)

Why is clean_bool being called in the first place for a user-provided dictionary? I’m not sure how this is necessary because the user can provide a literal boolean value in the dictionary if needed, no? In the event that a driver needs to take a case-sensitive boolean string for some parameter, clean_bool would make it impossible to provide such a value.

TL;DR: I agree we should remove clean_bool.

As for why it was put in in the first place, here’s what I think happened.

Observe what happens with airflow’s connection URI format when we try to pass a boolean value:

>>> c = Connection(conn_id='hello', uri='hello://?my_val=True')
>>> c.extra_dejson
{'my_val': 'True'}

With the URI format, it’s impossible to produce a JSON object with boolean values: every value comes back as a string.
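The same effect can be seen without Airflow. This is a minimal sketch using only the standard library's urllib.parse, under the assumption that URI query parameters are parsed along these lines (Airflow's actual Connection parsing may differ in its details). A query string carries no type information, so the parsed value is the text 'True' rather than a boolean:

```python
from urllib.parse import parse_qsl, urlsplit

uri = "hello://?my_val=True"

# Split off the query part and turn it into a plain dict.
query = urlsplit(uri).query
extra = dict(parse_qsl(query))

print(extra)                   # {'my_val': 'True'}
print(type(extra["my_val"]))   # <class 'str'>
```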

So when you are using top-level key-value pairs in conn extra, in some cases it makes sense to cast to bool.

I suspect that initially, during the development of this hook, the connect kwargs were top-level within extra, where doing this cast would make sense.

But when dealing with nested JSON, the boolean vs. string issue becomes irrelevant and you have new problems to solve. Namely, at the time this hook was merged, airflow’s conn URI did not support nested JSON. So this hook did not actually allow for storage of connect_kwargs in extra when using the URI format. For that, you’d have to add a json.loads-if-str conversion here. But now that we have support for arbitrary JSON in conn extra, there’s no need for such a conversion.

So since connect_kwargs is nested JSON, there’s no valid reason for converting to bool. I suspect this was just an oversight, and accordingly, even though it could be fixed, it is best to remove it.
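To illustrate both points, here is a small sketch using only the standard json module. The extra payload and the helper name loads_if_str are assumptions made up for this example; they are not part of the hook. Nested JSON already decodes to real Python booleans, and a json.loads-if-str conversion would only matter if the nested value arrived as a string:

```python
import json

# Hypothetical conn extra stored as real (nested) JSON.
extra_json = '{"connect_kwargs": {"CurrentSchema": "SCHEMA", "Blah": true, "Blah2": false}}'
extra = json.loads(extra_json)
connect_kwargs = extra["connect_kwargs"]
print(connect_kwargs)  # strings stay strings, true/false become bool


def loads_if_str(value):
    """Hypothetical helper: decode only when the value is still a JSON string."""
    return json.loads(value) if isinstance(value, str) else value


# Case where the nested value was flattened to a string, as a URI would force.
flattened = {"connect_kwargs": '{"Blah": true}'}
decoded = loads_if_str(flattened["connect_kwargs"])
print(decoded)
```

In neither case is a string-to-bool cast needed, since the JSON decoder already distinguishes "true" (a string) from true (a boolean).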

Goodkat commented, Apr 26, 2021 (1 reaction)

@marcosmarxm I have removed clean_bool on my local copy and can confirm that it now returns the schema_name instead of None. I used the following UI extras for my tests, and it works for unquoted boolean values as well:

{
  "connect_kwargs": {
    "CurrentSchema": "SCHEMA",
    "Blah": true,
    "Blah2": false
  }
}

>>> from airflow.providers.odbc.hooks.odbc import OdbcHook
>>> OdbcHook('myconn').connect_kwargs
{'CurrentSchema': 'SCHEMA', 'Blah': True, 'Blah2': False}

As the next step, I will make and commit the changes in PR #15510.


