assert_frame_equal not differentiating NaN and None within object dtype
Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd
import pandas.testing as tm
df1 = pd.DataFrame([['foo', 'bar', 'baz'], [None, None, None]])
df2 = pd.DataFrame([['foo', 'bar', 'baz'], [np.nan, np.nan, np.nan]])
tm.assert_frame_equal(df1, df2)
Problem description
No AssertionError gets raised in this case, even though I would not expect None and np.nan to be considered equal values (I found this while testing #18450).
Note that if you simply compared a DataFrame containing only np.nan with another containing only None, you would get an AssertionError reporting that the dtypes are different (float64 vs object). Because the missing values are mixed into object-dtype columns here, I think that differentiation gets lost.
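The dtype difference described above can be checked directly. The following is a small sketch, not taken from the issue: the variable names are illustrative, and the exact dtypes printed depend on the pandas version (newer releases may infer a dedicated string dtype for the mixed frames instead of object).

```python
import numpy as np
import pandas as pd

# Frames that contain only one kind of missing value.
only_nan = pd.DataFrame([[np.nan, np.nan, np.nan]])
only_none = pd.DataFrame([[None, None, None]])

# Frames that mix strings with one kind of missing value, as in the report.
mixed_nan = pd.DataFrame([['foo', 'bar', 'baz'], [np.nan, np.nan, np.nan]])
mixed_none = pd.DataFrame([['foo', 'bar', 'baz'], [None, None, None]])

# The all-NaN frame is inferred as float64, so a dtype check can tell
# it apart from the all-None frame.
print(only_nan.dtypes.tolist())
print(only_none.dtypes.tolist())

# Once strings are present, both mixed frames are inferred to the same
# dtype, so a dtype check alone can no longer tell None from NaN.
print(mixed_nan.dtypes.tolist())
print(mixed_none.dtypes.tolist())
```

With matching dtypes, any distinction between None and np.nan has to come from comparing the values themselves.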
Expected Output
AssertionError: DataFrame.iloc[1, :] are different
Output of pd.show_versions()
INSTALLED VERSIONS
commit: d64995a4fcc3269dff4366988230563b8aeffb9f
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 17.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.22.0.dev0+205.gd64995a4f
pytest: 3.2.5
pip: 9.0.1
setuptools: 36.4.0
Cython: 0.26
numpy: 1.13.1
scipy: None
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.3
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Issue Analytics
- Created 6 years ago
- Comments:6 (6 by maintainers)
Top GitHub Comments
from above, this is actually wrong and we should strictly check this.
I think this also affects pd.NA.
A keyword to control this would be great. Based on the recent experience with check_freq, we should perhaps not make things strict by default. But we could do non-strict + a warning by default, with the option to override in the function or via a global option.
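Until such a keyword exists, a strict check can be approximated in user code. The sketch below is only an illustration of the idea, not the pandas implementation or a proposed API: missing_kind and assert_same_missing_kinds are hypothetical helper names, the check only looks at object-dtype columns, and pd.NA requires pandas 1.0 or later. The frames are built with dtype=object so that None and np.nan are stored as given.

```python
import numpy as np
import pandas as pd


def missing_kind(value):
    """Classify a scalar as 'None', 'NA', 'NaN', or 'present' (hypothetical helper)."""
    if value is None:
        return "None"
    if value is pd.NA:
        return "NA"
    if isinstance(value, float) and np.isnan(value):
        return "NaN"
    return "present"


def assert_same_missing_kinds(left, right):
    """Raise AssertionError if object columns use different kinds of missing value.

    Hypothetical helper: it does not compare the non-missing values themselves.
    """
    assert left.shape == right.shape, "shapes differ"
    for pos in range(left.shape[1]):
        lcol, rcol = left.iloc[:, pos], right.iloc[:, pos]
        # Only object columns can hold None, np.nan and pd.NA side by side.
        if lcol.dtype != object or rcol.dtype != object:
            continue
        for row, (lval, rval) in enumerate(zip(lcol, rcol)):
            lkind, rkind = missing_kind(lval), missing_kind(rval)
            if lkind != rkind:
                raise AssertionError(
                    f"DataFrame.iloc[{row}, {pos}] differs in missing-value kind: "
                    f"{lkind} vs {rkind}"
                )


# dtype=object pins the storage so the missing values are kept as written.
df1 = pd.DataFrame([['foo', 'bar', 'baz'], [None, None, None]], dtype=object)
df2 = pd.DataFrame([['foo', 'bar', 'baz'], [np.nan, np.nan, np.nan]], dtype=object)

try:
    assert_same_missing_kinds(df1, df2)
except AssertionError as exc:
    print(exc)
```

A helper like this could be called alongside assert_frame_equal in a test suite, so that the regular comparison stays non-strict while the None versus NaN distinction is checked explicitly where it matters.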