[0.24.1] New nullable integer fillna with non-int doesn't coerce to object
Code Sample
import pandas as pd
sample_data = []
sample_data.append({"integer_column":None})
sample_data.append({"integer_column":1})
sample_data.append({"integer_column":2})
df = pd.DataFrame(sample_data)
# Previous type is object
# df.dtypes
df.loc[:,'integer_column'] = df.loc[:,'integer_column'].astype('Int64')
# Check new type is Int64, nullable
# df.dtypes
df.fillna('null_string')
Problem description
With the new nullable Int64 dtype, it is not possible to fill missing (“NaN”) values with a value of another type, such as a string.
Error raised
TypeError: <U11 cannot be converted to an IntegerDtype
Expected Output
The new DataFrame should have its NaN values replaced with the value passed to the .fillna() method.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 85 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None
pandas: 0.24.1
pytest: 3.3.2
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.1.2
openpyxl: 2.5.12
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml.etree: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.1.18
pymysql: None
psycopg2: None
jinja2: 2.8.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
Issue Analytics
- Created 5 years ago
- Reactions: 3
- Comments: 22 (15 by maintainers)
I would assume that .fillna() would coerce the series to object dtype when I am trying to fill it with an object, just as, for example, adding a float to it coerces it to a float dtype. However, .fillna() does not behave this way.
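The float coercion mentioned in this comment can be illustrated with a short sketch. The series values below are made up for illustration, and the exact name of the resulting dtype is an assumption that depends on the pandas version (plain float64 in older releases, possibly the nullable Float64 in newer ones), so the example only prints it rather than asserting a specific name.

```python
import pandas as pd

# Nullable integer series with one missing value (illustrative data).
s = pd.Series([1, 2, None], dtype="Int64")

# Adding a float does not raise: the result is coerced to a float dtype.
# The exact dtype name varies between pandas versions.
result = s + 0.5

print(s.dtype, "->", result.dtype)
```

The point of the comparison is that arithmetic with a float widens the dtype silently, whereas filling with a non-integer value is rejected.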
I brought it up here: https://github.com/pandas-dev/pandas/issues/25288#issuecomment-592917095
I still think it makes more sense to be able to just use .fillna() without explicitly casting first. As I stated earlier in this thread, this is the default behaviour for .fillna() on other dtypes, and it is also a logical step given that, for example, adding a float to an Int64 coerces the result.
What you are suggesting, from a user perspective, is that sometimes I can call .fillna() directly and sometimes I will have to cast + fill. As a user, I would prefer consistent behaviour from .fillna().
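For readers who need a workaround in the meantime, the "cast + fill" approach discussed in this thread might look roughly like the sketch below. The helper name fillna_with_fallback is hypothetical and not part of pandas. The issue reports a TypeError; catching ValueError as well is a defensive assumption, since the exception type could differ between versions.

```python
import pandas as pd


def fillna_with_fallback(series, value):
    """Hypothetical helper: fill directly, else cast to object and fill."""
    try:
        # Works when the fill value is compatible with the series dtype.
        return series.fillna(value)
    except (TypeError, ValueError):
        # Incompatible fill value: cast to object first, then fill.
        return series.astype(object).fillna(value)


ints = pd.Series([None, 1, 2], dtype="Int64")

# Compatible fill value: the direct path is expected to be taken.
filled_int = fillna_with_fallback(ints, 0)

# Incompatible fill value: the result ends up with object dtype.
filled_str = fillna_with_fallback(ints, "null_string")

print(filled_int.tolist())
print(filled_str.tolist())
```

Wrapping both paths in one function gives callers a single entry point, which is the consistency the comment asks for, at the cost of silently changing the dtype to object when the fill value does not fit.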