BUG: Assigning dictionary to series using .loc produces random results.
See original GitHub issuePandas version checks
-
I have checked that this issue has not already been reported.
-
I have confirmed this bug exists on the latest version of pandas.
-
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
df = pd.DataFrame({'A': [3, 9, 3, 4], 'B': [7, 1, 6, 0], 'C': [9, 0, 3, 4], 'D': [1, 8, 0, 0]})
A B C D
0 3 7 9 1
1 9 1 0 8
2 3 6 3 0
3 4 0 4 0
d = {0:10,1:20,2:30,3:40}
df.loc[:,'A'] = d
Output:
A B C D
0 0 7 9 1
1 1 1 0 8
2 2 6 3 0
3 3 0 4 0
Issue Description
The values that are assigned to columns A are the keys of the dictionary instead of the values.
If however, instead of assigning the dictionary to an existing column, if we create a new column, we will get the same result the first time we run it, but running the same code again will get the expected result. We then are able to select any column and it will return the expected output.
First time running df.loc[:,'E'] = {0:10,1:20,2:30,3:40}
Output:
A B C D E
0 0 7 9 1 0
1 1 1 0 8 1
2 2 6 3 0 2
3 3 0 4 0 3
Second time running df.loc[:,'E'] = {0:10,1:20,2:30,3:40}
A B C D E
0 0 7 9 1 10
1 1 1 0 8 20
2 2 6 3 0 30
3 3 0 4 0 40
Then if we run the same code as we did at first, we get a different result:
df.loc[:,'A'] = {0:10,1:20,2:30,3:40}
Output:
A B C D E
0 10 7 9 1 10
1 20 1 0 8 20
2 30 6 3 0 30
3 40 0 4 0 40
Expected Behavior
d = {0:10,1:20,2:30,3:40}
df.loc[:,'A'] = d
Output:
A B C D
0 10 7 9 1
1 20 1 0 8
2 30 6 3 0
3 40 0 4 0
Installed Versions
1.4.2
Issue Analytics
- State:
- Created a year ago
- Comments:5 (4 by maintainers)
Top GitHub Comments
Had a chance to take a look. This is a bug, because aligning the dict for the column case was simply forgotten, additionally, the multi-block case already works:
returns
The point with assigning dicts to rows is that the keys are matched with the column names. I am not sure if this makes much sense for columns, because you could simply use a Series