Duplicated column name causes inconsistent ValueError during assignment to a unique column
Code Sample, a copy-pastable example if possible
import pandas as pd
df1 = pd.DataFrame([[1,2,3,4]], columns=['C','D','D','a'])
df1['a'] = df1['a']
ValueError: Buffer has wrong number of dimensions (expected 1, got 0)
However, the following works without an error message:
df1 = pd.DataFrame([[1,2,3,4]], columns=['C','B','B','a'])
df1['a'] = df1['a']
Problem description
I believe a ValueError should not be raised in the first example. The issue appears to occur only when there is a duplicated column name. Interestingly, there also appears to be some link to the alphabetical order of the column names: the example fails with the duplicated name 'D' but not with 'B'.
This appears similar to the issue reported in #21668.
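The two cases above can be compared side by side with a small script. This is only a sketch: the helper name try_self_assign is hypothetical (it is not part of pandas), and whether the first case raises depends on the installed pandas version. The script also prints Index.is_unique and Index.is_monotonic_increasing for each set of labels, since sort order is one observable difference between the two examples; that this difference is the actual cause is an assumption, not something confirmed in this report.

```python
import pandas as pd


def try_self_assign(columns):
    """Hypothetical helper: build a one-row frame with the given labels
    and assign column 'a' to itself.

    Returns None on success, or the ValueError message if one is raised.
    """
    df = pd.DataFrame([[1, 2, 3, 4]], columns=columns)
    try:
        df["a"] = df["a"]
    except ValueError as exc:
        return str(exc)
    return None


for columns in (["C", "D", "D", "a"], ["C", "B", "B", "a"]):
    idx = pd.Index(columns)
    print(
        columns,
        "| unique:", idx.is_unique,
        "| sorted:", idx.is_monotonic_increasing,
        "| error:", try_self_assign(columns),
    )
```

On an affected version the first row should report the "Buffer has wrong number of dimensions" message; on a fixed version both rows should report no error.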
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.76-1-lts
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.4
pytest: None
pip: 18.0
setuptools: 40.4.3
Cython: None
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: 1.8.0
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.7
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: 4.2.5
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.12
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Issue Analytics
- Created 5 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
I just built pandas-dev on my end, and it seems the issue no longer exists. pandas version: 1.2.0.dev0+1174.g4cfa97a18
The fix is on master; I did not test other versions.