BUG: int64 and uint64 values converted to float64 when concatenated
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample
import pandas as pd

pd.concat([
    pd.Series([1, 2], dtype='uint64'),
    pd.Series([3, 4], dtype='int64'),
])

# Output:
# 0    1.0
# 1    2.0
# 0    3.0
# 1    4.0
# dtype: float64
Problem description
The output dtype becomes float64 when the inputs are int64 and uint64, even though both are integer types.
Since there are no NaN values or fractional values involved, it can be confusing that you suddenly get float output.
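A quick way to see where the float64 comes from, assuming pandas defers to NumPy's promotion rules for this case (the maintainer comment below points in that direction). This is only a sketch of the NumPy side, not of pandas internals:

```python
import numpy as np

# No NumPy integer dtype covers both the int64 and the uint64 range,
# so the common dtype NumPy picks for the pair is float64.
common = np.promote_types(np.int64, np.uint64)
print(common)  # float64

# Narrower mixed-sign pairs still have a wider signed integer to fall back on.
narrower = np.promote_types(np.int32, np.uint32)
print(narrower)  # int64
```

So the float result is specific to the 64-bit pair, where no wider integer dtype is available.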
Expected Output
0    1
1    2
0    3
1    4
dtype: int64
union(int64, uint64) ⊆ int65, which is the same as int64 unless you have numbers close to “the edge”, i.e. uint64 values above 2**63 - 1.
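Until the default changes (if it ever does), one way to get the expected output is to cast explicitly before concatenating. The sketch below uses a hypothetical helper, `concat_as_int64`, which is not part of pandas; it assumes all inputs are integer Series and refuses any uint64 value that would not fit in int64:

```python
import numpy as np
import pandas as pd

def concat_as_int64(series_list):
    """Hypothetical helper: concatenate integer Series as int64, refusing lossy casts."""
    int64_max = np.iinfo(np.int64).max
    converted = []
    for s in series_list:
        # Compare as Python ints so the check itself cannot lose precision.
        if s.dtype == np.uint64 and len(s) and int(s.max()) > int64_max:
            raise OverflowError("uint64 value does not fit in int64")
        converted.append(s.astype("int64"))
    return pd.concat(converted)

result = concat_as_int64([
    pd.Series([1, 2], dtype="uint64"),
    pd.Series([3, 4], dtype="int64"),
])
print(result.dtype)  # int64
```

The overflow check matters because float64 has a 53-bit significand, so neither a silent cast to float nor a wrapped cast to int64 preserves values near the top of the uint64 range.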
Output of pd.show_versions()
INSTALLED VERSIONS
commit           : None
python           : 3.7.3.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.9.125-linuxkit
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : en_US.UTF-8
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 0.25.0
numpy            : 1.17.0
pytz             : 2019.2
dateutil         : 2.8.0
pip              : 19.2.1
setuptools       : 41.0.1
Cython           : 0.29.13
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.5.0
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.10.1
IPython          : 7.7.0
pandas_datareader: None
bs4              : 4.8.0
bottleneck       : None
fastparquet      : 0.3.2
gcsfs            : None
lxml.etree       : 4.5.0
matplotlib       : 3.1.1
numexpr          : 2.6.9
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
s3fs             : None
scipy            : 1.3.0
sqlalchemy       : 1.3.6
tables           : None
xarray           : None
xlrd             : 1.2.0
xlwt             : None
xlsxwriter       : None
Issue Analytics
- State:
- Created 3 years ago
- Comments: 14 (10 by maintainers)
Top GitHub Comments
See https://pandas.pydata.org/docs/dev/development/extending.html#extension-types and https://pandas.pydata.org/docs/dev/user_guide/integer_na.html
@bergkvist I am going to close this because the original issue (concatenating int64 and uint64 giving float64) is not something we are going to change (that discussion needs to happen in numpy, where there has already been some).
But we can certainly further discuss the usability problems regarding unsigned integers in pandas, like the indexing you mentioned.