New behaviour regarding inplace value setting with iloc
In scikit-learn, when testing the pandas nightly build, we got a FutureWarning related to the following deprecation:
We have 2 related questions regarding this deprecation. First, it seems that we cannot reproduce the “Old Behaviour” with the latest available release:
In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: pd.__version__
Out[3]: '1.4.2'

In [4]: values = np.arange(4).reshape(2, 2)
   ...:
   ...: df = pd.DataFrame(values)
   ...:
   ...: ser = df[0]

In [5]: df.iloc[:, 0] = np.array([10, 11])

In [6]: ser
Out[6]:
0    10
1    11
Name: 0, dtype: int64
Is there a reason why we do not observe the behaviour shown in the documentation?
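To make the check easier to repeat on other pandas versions, the session above can be condensed into a standalone script. This is only a sketch: the name ser_updated is introduced here for illustration, and the script makes no claim about which result a given version should produce.

```python
import numpy as np
import pandas as pd

values = np.arange(4).reshape(2, 2)
df = pd.DataFrame(values)
ser = df[0]  # column taken *before* the assignment

# Positional assignment of a whole column, as in the issue.
df.iloc[:, 0] = np.array([10, 11])

# True if the previously extracted Series reflects the new values.
ser_updated = ser.tolist() == [10, 11]

print("pandas version:", pd.__version__)
print("ser reflects the iloc assignment:", ser_updated)
print("source ndarray was modified:", values[:, 0].tolist() == [10, 11])
```

Running it under the release and under the nightly build shows directly whether the two disagree.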
The second question (actually more a comment to open a discussion) concerns the proposed fix.
It is proposed to use df[df.columns[i]] = newvals instead of df.iloc[:, i] = newvals.
I personally find this a bit counterintuitive, since the SettingWithCopyWarning proposes changing df[cols][rows] to df.loc[rows, cols] to get the inplace behaviour.
If we consider that both approaches are intended to make an inplace change, the patterns used for “by position” (i.e. .iloc) and “by label” (i.e. .loc) are really different.
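For reference, the three spellings discussed here can be put side by side. The sketch below only illustrates the syntax; the frame names df_a, df_b and df_c are made up for the example, and whether each assignment writes into the existing array or swaps in a new one depends on the pandas version, so nothing about that is asserted.

```python
import numpy as np
import pandas as pd

i = 0
newvals = np.array([10, 11])

# Pattern proposed by the deprecation: plain column assignment.
df_a = pd.DataFrame(np.arange(4).reshape(2, 2))
df_a[df_a.columns[i]] = newvals

# "By position" pattern that triggered the FutureWarning.
df_b = pd.DataFrame(np.arange(4).reshape(2, 2))
df_b.iloc[:, i] = newvals

# "By label" pattern, in the form SettingWithCopyWarning recommends.
df_c = pd.DataFrame(np.arange(4).reshape(2, 2))
df_c.loc[:, df_c.columns[i]] = newvals

for name, frame in [("setitem", df_a), ("iloc", df_b), ("loc", df_c)]:
    print(name, frame.iloc[:, i].tolist())
```

All three leave the same values in the frame; the difference under discussion is only whether objects that shared the old column data see the change.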
Top GitHub Comments
Indeed, in this case setting the values causes an “enlargement” of the dataframe, which can never be done inplace, so we should avoid the warning in that case.
Indeed.
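The exact case the comments above refer to is not shown here, so the following is a generic illustration of what “enlargement” means, under the assumption that it refers to assigning to a label that does not exist yet. The column names a, b and c are chosen for the example.

```python
import numpy as np
import pandas as pd

values = np.arange(4).reshape(2, 2)
df = pd.DataFrame(values, columns=["a", "b"])

# "c" is not an existing column, so this assignment has to grow the frame:
# there is no existing array of the right shape to write the values into.
df.loc[:, "c"] = np.array([10, 11])

print(df.shape)      # (2, 3)
print(values.shape)  # (2, 2): the source array cannot have been extended
```

Since the 2x2 source array has no room for a third column, the new data necessarily lives in newly allocated storage, which is why an inplace warning would not make sense for this kind of assignment.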