BUG: int64 and uint64 values converted to float64 when concatenated

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample

import pandas as pd

pd.concat([
    pd.Series([1,2], dtype='uint64'),
    pd.Series([3,4], dtype='int64')
])

# Output:
0    1.0
1    2.0
0    3.0
1    4.0
dtype: float64

Problem description

The output datatype becomes float when the input datatypes are different integer types.

Since there are no NaN-values or decimals involved, it can be confusing that you suddenly get float output.
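A minimal sketch of where the float comes from, assuming pandas defers to NumPy's type-promotion rules when the dtypes differ (the closing comment further down points to NumPy as the place where this is decided). The variable names are illustrative only:

```python
import numpy as np

# No 64-bit integer type can hold every int64 and every uint64 value,
# so NumPy promotes this particular pair to float64.
mixed = np.result_type(np.uint64, np.int64)

# For comparison: a narrower unsigned type fits inside int64,
# so that pair stays an integer type.
narrower = np.result_type(np.uint32, np.int64)

print(mixed)     # float64
print(narrower)  # int64
```

The surprise is therefore specific to pairing uint64 with a signed integer type; smaller unsigned types still promote to a signed integer.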

Expected Output

0    1
1    2
0    3
1    4
dtype: int64

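union(int64, uint64) ⊆ int65: a single extra bit would be enough to hold every value of both types. In practice that is the same as int64 unless you have numbers close to “the edge” (uint64 values above 2^63 − 1).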
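As a user-side sketch of the expected output, the inputs can be cast to int64 before concatenating, provided every uint64 value actually fits. `concat_as_int64` and `INT64_MAX` are hypothetical names introduced here for illustration, not part of pandas:

```python
import numpy as np
import pandas as pd

INT64_MAX = np.iinfo(np.int64).max  # 2**63 - 1


def concat_as_int64(parts):
    """Hypothetical helper: cast every Series to int64, then concatenate.

    Raises OverflowError if a uint64 Series holds a value beyond int64's range.
    """
    for s in parts:
        # Compare as Python ints so the check itself is exact.
        if s.dtype == np.uint64 and len(s) > 0 and int(s.max()) > INT64_MAX:
            raise OverflowError("uint64 value does not fit in int64")
    return pd.concat([s.astype("int64") for s in parts])


result = concat_as_int64([
    pd.Series([1, 2], dtype="uint64"),
    pd.Series([3, 4], dtype="int64"),
])
print(result.dtype)  # int64

# Near "the edge": 2**63 + 1 fits in uint64, but float64 cannot store it exactly.
edge = 2**63 + 1
edge_as_float = float(np.uint64(edge))
print(int(edge_as_float) == edge)  # False
```

The last two lines show why the float64 result is more than a cosmetic problem: for values near the top of the uint64 range, the conversion silently changes the number.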

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit           : None
python           : 3.7.3.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.9.125-linuxkit
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : en_US.UTF-8
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 0.25.0
numpy            : 1.17.0
pytz             : 2019.2
dateutil         : 2.8.0
pip              : 19.2.1
setuptools       : 41.0.1
Cython           : 0.29.13
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.5.0
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.10.1
IPython          : 7.7.0
pandas_datareader: None
bs4              : 4.8.0
bottleneck       : None
fastparquet      : 0.3.2
gcsfs            : None
lxml.etree       : 4.5.0
matplotlib       : 3.1.1
numexpr          : 2.6.9
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
s3fs             : None
scipy            : 1.3.0
sqlalchemy       : 1.3.6
tables           : None
xarray           : None
xlrd             : 1.2.0
xlwt             : None
xlsxwriter       : None

Issue Analytics

State:
Created 3 years ago
Comments: 14 (10 by maintainers)

Top GitHub Comments

1 reaction

jorisvandenbossche commented, May 25, 2020

See https://pandas.pydata.org/docs/dev/development/extending.html#extension-types and https://pandas.pydata.org/docs/dev/user_guide/integer_na.html

0 reactions

jorisvandenbossche commented, May 26, 2020

@bergkvist I am going to close this because the original issue (concatenating int64 and uint64 giving float64) is not something we are going to change (that discussion needs to happen in numpy, where there has already been some).

But we can certainly further discuss the usability problems regarding unsigned integers in pandas, like the indexing you mentioned.