BUG: should constructing Index from a Series make a copy?
From a comment of @jbrockmendel at https://github.com/pandas-dev/pandas/pull/41878#issuecomment-881557871:
ser = pd.Series(range(5))
idx = pd.Index(ser)
ser[0] = 10
>>> idx[0]
10
In the above, we create an Index from a Series and then mutate the Series, which also updates the Index, even though an Index is assumed to be immutable.
Changing the example a bit, you can obtain wrong values with indexing this way:
ser = pd.Series(range(5))
idx = pd.Index(ser)
ser.index = idx
>>> ser[0]
0
>>> ser.iloc[0] = 10
>>> ser[0]
10
>>> ser
10 10
1 1
2 2
3 3
4 4
dtype: int64
So ser[0] is still giving a result, even though that key no longer exists in the Series’ index at that point.
I know that we generally consider this a user error when it happens with a numpy array (creating idx = pd.Index(arr) and then mutating the array), but here you get the same behavior using only high-level pandas objects. In that case, should we prevent this from happening?
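Until the default behavior is settled, one way to avoid the shared buffer is to request a copy explicitly through the `copy` parameter of the `Index` constructor. The following is a minimal sketch, not a proposed fix. The variable names `idx_default` and `idx_copied` are made up for illustration, and whether the default construction shares memory may depend on the pandas version, so that result is printed rather than assumed.

```python
import numpy as np
import pandas as pd

ser = pd.Series(range(5))

# Default construction: whether the Index shares the Series' buffer
# may depend on the pandas version, so we only print the result.
idx_default = pd.Index(ser)
print(np.shares_memory(idx_default.to_numpy(), ser.to_numpy()))

# Explicit copy: the Index gets its own buffer, so later mutation
# of the Series cannot leak into it.
idx_copied = pd.Index(ser, copy=True)
ser.iloc[0] = 10
print(idx_copied[0])  # 0
```

This puts the burden on the caller, which is exactly what the issue questions for the case where only pandas objects are involved.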
Issue Analytics
- State:
- Created 2 years ago
- Comments:8 (6 by maintainers)
@rhshadrach OK, thanks, I’ll try to check that if I ever manage to get the segmentation faults reproduced. Let me know if I can be of further help.
BTW, the other way around, when constructing a Series from an Index, we do take a copy to avoid such issues. In Series.__init__: https://github.com/pandas-dev/pandas/blob/57d8d3a7cc2c4afc8746bf774b5062fa70c0f5fd/pandas/core/series.py#L401-L409
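The reverse direction can be checked with a short sketch like the one below. It assumes a pandas version that includes the copy in `Series.__init__` mentioned above. The values are arbitrary.

```python
import pandas as pd

idx = pd.Index([0, 1, 2, 3, 4])

# Constructing a Series from an Index: the Series does not
# write through to the Index's data.
ser = pd.Series(idx)
ser.iloc[0] = 10

print(idx[0])       # 0, the Index is unchanged
print(ser.iloc[0])  # 10
```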