DataFrame.join left_index right_index inverted

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

df_left = pd.DataFrame(data=['X'],columns=['C'],index=[22])
df_right = pd.DataFrame(data=['X'],columns=['C'],index=[999])
merge = pd.merge(df_left,df_right,on=['C'], left_index=True)

# Parenthesised print works on both Python 2 and Python 3
print(merge.index)

Problem description

The code above prints the index of the merged DataFrame, which contains the key 999. As I understand the documentation, with left_index=True the keys from the left DataFrame should be used as join keys.

My output: Int64Index([999], dtype='int64')
Expected output: Int64Index([22], dtype='int64')
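For comparison, here is a minimal sketch of the two unambiguous ways to call merge on the same frames, without mixing on and left_index. It only illustrates the documented semantics (columns as keys versus index values as keys); it is not a statement about how the reported case is handled internally. The variable names by_column and by_index are made up for this example.

```python
import pandas as pd

df_left = pd.DataFrame(data=['X'], columns=['C'], index=[22])
df_right = pd.DataFrame(data=['X'], columns=['C'], index=[999])

# Column-on-column merge: both indexes are ignored and the result
# gets a fresh default index.
by_column = pd.merge(df_left, df_right, on='C')

# Index-on-index merge: the index values themselves are the join keys.
# 22 and 999 do not match, so the (default inner) join is empty.
by_index = pd.merge(df_left, df_right, left_index=True, right_index=True)

print(list(by_column.index))
print(len(by_index))
```

In other words, left_index=True tells merge to join on the left index; by itself it does not promise that the left index labels survive in the result.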

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-32-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.23.3
pytest: None
pip: 18.0
setuptools: 20.7.0
Cython: None
numpy: 1.15.0
scipy: None
pyarrow: None
xarray: None
IPython: 5.8.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.5
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments:10 (5 by maintainers)

Top GitHub Comments

1 reaction
TColl commented, Nov 4, 2019

I think we’re not quite on the same page here.

on lets us specify the merge columns for both dataframes with a single argument. In this case we're using on instead of left_on and/or right_on.

In this situation, if left_index or right_index are included (but left_on and right_on are excluded), the behaviour mentioned in this issue occurs.

Here's an example:

import pandas as pd

age_data = {
    'Name':['Ash','Bob','Charlie'],
    'ID':[1,2,3],
    'Age':[18,80,55]
}
height_data = {
    'Name':['Ash','Charlie','Derek'],
    'ID':[1,3,4],
    'Height':[140,162,180]
}

ages = pd.DataFrame(data=age_data, index=[1,2,3])
heights = pd.DataFrame(data=height_data, index=[91,92,93])

common_columns = ['Name', 'ID']
               
common_records_left_index = pd.merge(ages, heights, how='inner', on=common_columns, left_index=True)
common_records_right_index = pd.merge(ages, heights, how='inner', on=common_columns, right_index=True)

Here, we should end up with two dataframes which both contain the combined age and height data for Ash and Charlie (as they’re the only records with both an age and a height provided), with index values as follows:

  • common_records_left_index should have index keys of 1 and 3 (preserved from the ages dataframe), and
  • common_records_right_index should have index keys of 91 and 93 (preserved from the heights dataframe)

However, the opposite case is true - left_index=True preserves the keys from the right dataframe during the merge, and right_index=True preserves the keys from the left dataframe.

0 reactions
jreback commented, Nov 23, 2020

would take a PR for a test that replicates the OP
