BUG: erroneous initialization of a DataFrame with Series objects
See original GitHub issue
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd

x = pd.Series(["a", "b", "c"])
y = pd.Series([1, 2, 3])

pd.DataFrame(y, x)
#      0
# a  NaN
# b  NaN
# c  NaN

pd.DataFrame(x, y)
#      0
# 1    b
# 2    c
# 3  NaN

pd.DataFrame(x.values, y.values)
#    0
# 1  a
# 2  b
# 3  c
Problem description
I would expect pd.Series objects to be valid inputs for the DataFrame constructor, producing the same result as passing their underlying values.
If that is not the case, a warning (or even an error) would be nice…
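For anyone landing here, the outputs above look consistent with label alignment rather than positional assignment: when the data is a Series, its own index (0, 1, 2) seems to be matched against the requested index, so labels that are not found come back as NaN. The snippet below is only a sketch of that reading and of one possible workaround (dropping the index with `to_numpy()`); it is not taken from the pandas source, and the variable names `aligned`, `partial` and `positional` are made up for illustration.

```python
import pandas as pd

x = pd.Series(["a", "b", "c"])
y = pd.Series([1, 2, 3])

# Data is a Series: its labels 0, 1, 2 are looked up in the requested
# index ("a", "b", "c"). None of them match, so every row is NaN.
aligned = pd.DataFrame(y, index=x)

# Labels 1 and 2 exist in x's own index (0, 1, 2); label 3 does not,
# so only the last row is NaN.
partial = pd.DataFrame(x, index=y)

# Possible workaround: pass plain values so there is nothing to align on,
# and the new index is applied by position.
positional = pd.DataFrame(y.to_numpy(), index=x)

print(aligned)
print(partial)
print(positional)
```

Passing `x.values` / `y.values`, as in the third call of the report, has the same effect as `to_numpy()` here.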
Output of pd.show_versions()
INSTALLED VERSIONS
commit : c7f7443c1bad8262358114d5e88cd9c8a308e8aa
python : 3.9.6.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.17763
machine : AMD64
processor : AMD64 Family 25 Model 33 Stepping 0, AuthenticAMD
byteorder : little
LC_ALL : None
LANG : en
LOCALE : German_Austria.1252

pandas : 1.3.1
numpy : 1.21.1
pytz : 2021.1
dateutil : 2.8.2
pip : 21.2.1
setuptools : 49.6.0.post20210108
Cython : None
pytest : 6.2.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : 2.7.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.0
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None
Issue Analytics
- Created 2 years ago
- Comments: 15 (6 by maintainers)
Hi, can I pick this issue as my first OSS contribution?
@tyuyoshi sure. go for it!