pd.read_sql_query() does not convert NULLs to NaN
A small, complete example of the issue:
from sqlalchemy import create_engine
import pandas as pd
engine = create_engine('sqlite://')
conn = engine.connect()
conn.execute("create table test (a float)")
for _ in range(5):
    conn.execute("insert into test values (NULL)")
df = pd.read_sql_query("select * from test", engine, coerce_float=True)
print(df.a)
Expected Output
In pandas 0.18.1 this results in a column of dtype object containing None values, whereas I need float("nan"). The coerce_float=True option makes no difference. This matters most when reading a float column chunk-wise, since a chunk may consist entirely of NULLs.
(also http://stackoverflow.com/questions/30652457/adjust-pandas-read-sql-query-null-value-treatment/)
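(For illustration, a minimal check of what comes back from the repro above, plus one way to coerce the column afterwards; pd.to_numeric is standard pandas, and the column name a comes from the example table.)
# The column arrives as object dtype filled with None, not float NaN:
print(df.a.dtype)      # object
print(df.a.tolist())   # [None, None, None, None, None]
# One way to coerce it to float so the Nones become NaN:
df['a'] = pd.to_numeric(df['a'], errors='coerce')
print(df.a.dtype)      # float64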
Issue Analytics
- Created 7 years ago
- Reactions: 3
- Comments: 12 (7 by maintainers)
My actual query is more complicated than that and involves multiple tables. So I can’t just use pd.read_sql_table. What I am doing at the moment is just converting None to NaN in the dataframe:
import numpy as np
df.replace([None], np.nan, inplace=True)
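An alternative sketch (not from the thread) that also covers the chunk-wise case mentioned above; the column name a and the chunksize value are assumptions for illustration:
chunks = []
for chunk in pd.read_sql_query("select * from test", engine, chunksize=2):
    # Force the float column in every chunk, so an all-NULL chunk does not
    # come back as an object column of None.
    chunk['a'] = pd.to_numeric(chunk['a'], errors='coerce')
    chunks.append(chunk)
df = pd.concat(chunks, ignore_index=True)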
Let’s actually reopen this: is it worth adding a coerce_null parameter to read_sql_query to handle cases like this?
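For context, a rough sketch of what a user-level helper could do today; the name read_sql_query_coerced and the float_cols parameter are hypothetical and not part of the pandas API, and the proposed coerce_null option does not exist yet:
import pandas as pd

def read_sql_query_coerced(sql, con, float_cols=None, **kwargs):
    # Hypothetical helper, not a pandas API: run the query as usual, then
    # force the named columns to float so all-NULL results become NaN
    # instead of an object column of None. (Does not handle chunksize.)
    df = pd.read_sql_query(sql, con, **kwargs)
    for col in (float_cols or []):
        df[col] = pd.to_numeric(df[col], errors='coerce')
    return df

# e.g. df = read_sql_query_coerced("select * from test", engine, float_cols=["a"])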