Inconsistent behavior when DataFrame with strings and None is created from lists or dictionary.

Code Sample

import pandas as pd


pd.__version__

# df created from a list: None is cast to str
df = pd.DataFrame(["1", "2", None], columns=["a"], dtype="str")
type(df.loc[2].values[0])


# Equivalent df created from a dict: None remains NoneType
df = pd.DataFrame({"a": ["1", "2", None]}, dtype="str")
type(df.loc[2].values[0])

Problem description

None is not cast consistently for DataFrames that contain None values and have dtype set to str.

If the DataFrame is created from a list, None is cast to str and becomes the string “None”. If the DataFrame is created from a dict, None is left untouched and remains None (NoneType).
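The comparison can be sketched as a small script that builds both frames and reports what ends up in the missing cell. The helper name `describe_missing_cell` is made up for illustration and is not part of pandas. The report above describes pandas 1.0.1, and other versions may print something different, so no expected output is shown.

```python
import pandas as pd


def describe_missing_cell(df):
    """Return (type name, repr) of the cell at row 2, column "a".

    Hypothetical helper for this illustration only.
    """
    cell = df.loc[2, "a"]
    return type(cell).__name__, repr(cell)


# Same data, two construction paths, both with dtype="str".
from_list = pd.DataFrame(["1", "2", None], columns=["a"], dtype="str")
from_dict = pd.DataFrame({"a": ["1", "2", None]}, dtype="str")

list_cell = describe_missing_cell(from_list)
dict_cell = describe_missing_cell(from_dict)

print("from list:", list_cell)
print("from dict:", dict_cell)
print("consistent:", list_cell == dict_cell)
```

If the last line prints `consistent: False`, the installed pandas version shows the inconsistency described here.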

IMO, the latter is the preferred behavior. I hope you consider this when continuing your work on string types and na types in future versions.
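One possible way to sidestep the difference, not suggested in the issue itself, is the nullable `"string"` extension dtype that pandas introduced in 1.0. The sketch below assumes that both construction paths then mark the missing entry as missing; it is an illustration under that assumption, not a fix confirmed by the maintainers.

```python
import pandas as pd

# Assumption: pandas >= 1.0, where the "string" extension dtype is available.
from_list = pd.DataFrame(["1", "2", None], columns=["a"], dtype="string")
from_dict = pd.DataFrame({"a": ["1", "2", None]}, dtype="string")

# Check which entries each frame reports as missing.
list_missing = from_list["a"].isna().tolist()
dict_missing = from_dict["a"].isna().tolist()

print("from list:", list_missing)
print("from dict:", dict_missing)
```

If the two lists match and flag only the last entry, the missing value was not turned into the text “None” in either frame.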

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 41.2.0
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.12.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.1.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.16.0
pytables : None
pytest : 5.3.5
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

prakhar987 commented, Feb 25, 2020 (1 reaction)

I am assuming one is wrong, the case where None becomes “None”.

cgarciae commented, Apr 12, 2021 (0 reactions)

take
