BUG: outer join out of order when joining multiple DataFrames
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import numpy as np
import pandas as pd
a = pd.DataFrame(np.ones((4, 2)), index=[3, 2, 1, 0])
b = pd.DataFrame(2 * np.ones((2, 1)), columns=[2], index=[3, 1])
c = pd.DataFrame(3 * np.ones((2, 1)), columns=[3], index=[3, 1])
a.join(b, how='outer')  # correct, the index is [0,1,2,3]
b.join(a, how='outer')  # correct, the index is [0,1,2,3]
a.join([b, c], how='outer')  # incorrect, the index is [3,2,1,0]
b.join([a, c], how='outer')  # incorrect, the index is [3,1,2,0]
Issue Description
The documentation for join says:
outer: form union of calling frame’s index (or column if on is specified) with other’s index, and sort it lexicographically.
But the behavior is not consistent. When joining more than two DataFrames (that is, when a list of frames is passed), the resulting index is left unsorted, which contradicts the documentation.
Expected Behavior
The result should be sorted when outer joining multiple DataFrames. For the example above, the following assertion should pass:
assert a.join([b, c], how='outer').equals(a.join([b, c], how='outer', sort=True))
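Until the list form of join sorts the index itself, one possible workaround is to sort explicitly after joining. The sketch below reuses the frames from the reproducible example and assumes only that DataFrame.sort_index is acceptable for the use case. It does not rely on whether a given pandas version sorts inside join.

```python
import numpy as np
import pandas as pd

# Same frames as in the reproducible example above.
a = pd.DataFrame(np.ones((4, 2)), index=[3, 2, 1, 0])
b = pd.DataFrame(2 * np.ones((2, 1)), columns=[2], index=[3, 1])
c = pd.DataFrame(3 * np.ones((2, 1)), columns=[3], index=[3, 1])

# Workaround sketch: sort the index explicitly instead of relying on
# join() to do it when a list of frames is passed.
result = a.join([b, c], how="outer").sort_index()

print(result.index.tolist())
```

Sorting after the join yields an ascending index whether or not the join sorted it already, so the same code should behave identically on versions with and without the bug.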
Installed Versions
1.5.0.dev0+471.g8e0baa2360
Issue Analytics
- State:
- Created 2 years ago
- Comments: 11 (5 by maintainers)
Top GitHub Comments
Sure. I’ll be glad to work on it.
Got it. Thanks!