pd.concat does not work correctly
Code Sample, a copy-pastable example if possible
df1.shape # (21141, 59)
df2.shape # (21141, 6)
result = pd.concat([df1, df2], axis=1, ignore_index=True)
result.shape # (42282, 65)
Problem description
I have 2 dataframes that I try to concatenate horizontally. The method concat doesn’t work: it returns a dataframe with a wrong dimension. Moreover, all column names happen to be changed to numbers going from 0 to 64…
The dataframes are created from a dataset that is a bit big so I cannot reproduce the creation code here but I can provide you with more details by e-mail.
Expected Output
The right dimension should be (21141, 65) and the resulting columns should be just the concatenation of df1’s columns and df2’s columns.
Output of pd.show_versions()
pandas: 0.22.0 pytest: 3.3.2 pip: 18.0 setuptools: 39.0.1 Cython: 0.27.3 numpy: 1.14.2 scipy: 1.0.0 pyarrow: None xarray: None IPython: 6.2.1 sphinx: 1.6.6 patsy: 0.5.0 dateutil: 2.6.1 pytz: 2017.3 blosc: None bottleneck: 1.2.1 tables: 3.4.2 numexpr: 2.6.4 feather: None matplotlib: 2.1.2 openpyxl: 2.4.10 xlrd: 1.1.0 xlwt: 1.3.0 xlsxwriter: 1.0.2 lxml: 4.1.1 bs4: 4.6.0 html5lib: 1.0.1 sqlalchemy: 1.2.1 pymysql: None psycopg2: None jinja2: 2.10 s3fs: None fastparquet: None pandas_gbq: None pandas_datareader: None
Issue Analytics
- Created 5 years ago
- Comments:8 (1 by maintainers)
It’s hard to say without a minimal example, but it appears that you’re getting confused by the alignment. See http://pandas-docs.github.io/pandas-docs-travis/reference/api/pandas.concat.html?highlight=concat#pandas.concat
Specifically, the ignore_index parameter. Since you’re using axis=1, ignore_index applies to the column labels, which are reset to [0, n).
If you really don’t care about your row labels, then you’ll want to drop the row labels before concatenating:
pd.concat([df1.reset_index(drop=True), df2.reset_index(drop=True)], ...)
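The behaviour described above can be reproduced on a small scale. The sketch below is only an illustration: the two tiny frames and their non-overlapping row labels are assumptions standing in for the reporter's df1 and df2, whose actual indexes were never posted.

```python
import pandas as pd

# Hypothetical small frames standing in for df1 and df2. The row labels
# deliberately do not overlap; that is an assumption about the original data.
df1 = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({"b": [4, 5, 6]}, index=[10, 11, 12])

# With axis=1, rows are aligned on their labels (outer join by default),
# so non-matching labels produce extra rows filled with NaN.
# ignore_index=True relabels the concatenation axis, here the columns.
misaligned = pd.concat([df1, df2], axis=1, ignore_index=True)
print(misaligned.shape)          # (6, 2)
print(list(misaligned.columns))  # [0, 1]

# Dropping the row labels first makes the frames line up by position.
# Leaving ignore_index at its default keeps the original column names.
aligned = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1
)
print(aligned.shape)          # (3, 2)
print(list(aligned.columns))  # ['a', 'b']
```

If the doubled row count in the report has the same cause, checking df1.index.equals(df2.index) before concatenating would show whether the row labels actually match.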
I agree with @Mark531 that there should be an intuitive way to merge dataframes horizontally. The documentation on ignore_index=True is unclear; I also spent time on this.