PandasCursor converts NULL values in String columns to empty Strings

In #117, a fix was made for NULL results not being returned by PandasCursor. This now works as expected:

> ret = conn.cursor(PandasCursor).execute("select * from (values (1), (NULL))").fetchall()
> ret
[(1,), (<NA>,)]
> [pd.isna(x[0]) for x in ret]
[False, True]

However, NULL values in String columns are silently converted to empty Strings:

> ret = conn.cursor(PandasCursor).execute("select * from (values ('bla'), (NULL))").fetchall()
> ret
[('bla',), ('',)]
> [pd.isna(x[0]) for x in ret]
[False, False]

Is this the expected behaviour? I believe NULL should always be converted to NaN, regardless of na_values or keep_default_na.
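
The behaviour above is consistent with how pandas.read_csv handles quoted fields. The following is a minimal sketch, assuming the cursor parses a CSV result file with pandas (the comments on this issue refer to CSVs written by Athena). The sample data and variable names are hypothetical, and this is not PyAthena's actual code. With default quoting, an empty unquoted field and a quoted empty string reach the NA check as the same empty value, so they come out either both as NaN or both as '':

```python
import io

import pandas as pd

# Hypothetical sample: row 2 has an empty unquoted field (standing in for NULL),
# row 3 has a quoted empty string.
SAMPLE_CSV = 'name,comment\n"bla","x"\n,"y"\n"","z"\n'

# Default settings: '' is in the default NA list, so both rows become NaN.
default_df = pd.read_csv(io.StringIO(SAMPLE_CSV))
null_and_empty_both_na = default_df["name"].isna().tolist()

# Default NA list disabled: both rows become the empty string instead.
no_na_df = pd.read_csv(io.StringIO(SAMPLE_CSV), keep_default_na=False)
null_and_empty_both_blank = (no_na_df["name"] == "").tolist()

print(null_and_empty_both_na)
print(null_and_empty_both_blank)
```

In both cases the second and third rows are indistinguishable, which is why na_values and keep_default_na alone cannot separate NULL from an empty String.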

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

jurgispods commented, Oct 26, 2020 (1 reaction)

@laughingman7743 Apologies, there was a copy-paste error in my second query. I’ve edited the original post.

jurgispods commented, Oct 27, 2020 (0 reactions)

I see. One approach would be to disable quoting and remove the quotes afterwards. With your example in mind:

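import csv
import pandas as pd

# With quoting disabled, the quote characters stay part of each value
df = pd.read_csv('myfile.csv', skip_blank_lines=False, quoting=csv.QUOTE_NONE)
# Remove the surrounding quotes from String columns manually and keep the result
str_cols = df.select_dtypes([object]).columns
df[str_cols] = df[str_cols].apply(lambda col: col.str[1:-1])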

This is not very elegant, but if CSVs written by Athena are guaranteed to contain quoted Strings, this should always work. What do you think?
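
A self-contained sketch of this idea, run against an in-memory CSV rather than a real Athena result, might look like the following. The sample data, the helper name read_keeping_nulls, and the assumption that every header and every non-NULL value is wrapped in double quotes are illustrative only and are not confirmed by the issue. For simplicity it reads every column as a string with dtype=str instead of selecting object columns.

```python
import csv
import io

import pandas as pd

# Hypothetical sample: quoted headers and values, NULL as an empty unquoted field.
SAMPLE_CSV = '"name","comment"\n"bla","x"\n,"y"\n"","z"\n'


def read_keeping_nulls(buf):
    """Read a fully quoted CSV so that NULL and '' stay distinguishable."""
    # QUOTE_NONE leaves the quote characters in place, so a quoted empty
    # string arrives as '""' while a NULL arrives as an empty field (NaN).
    df = pd.read_csv(
        buf, dtype=str, skip_blank_lines=False, quoting=csv.QUOTE_NONE
    )
    # Strip the quotes from the header names.
    df.columns = df.columns.str.strip('"')
    # Drop the first and last character of each value; NaN stays NaN.
    return df.apply(lambda col: col.str[1:-1])


df = read_keeping_nulls(io.StringIO(SAMPLE_CSV))
print(df)
```

One caveat of disabling quoting: a value that contains the delimiter or an embedded newline would be split incorrectly, so this only holds for data without such characters.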
