Q: DataFrame `loc` and `iloc` seem to have inconsistent negative indexing behaviors.
With pandas version 0.20.3, DataFrame `loc` and `iloc` have inconsistent and buggy indexing behaviors.
import pandas as pd
df = pd.DataFrame([dict(idx=idx) for idx in range(10)])
print(df.loc[list(range(3)) + list(range(-3, 0)), 'idx'])
returns NaN for negative indices
0 0.0
1 1.0
2 2.0
-3 NaN
-2 NaN
-1 NaN
(also note that somehow the `int` became `float` …)
whereas
print(df.iloc[list(range(3)) + list(range(-3, 0))])
returns the last rows
0 0
1 1
2 2
7 7
8 8
9 9
Additionally, `loc` fails if only negative indices are passed:
df.loc[[-2, -1], 'idx']
but not if both positive and negative indices are passed:
df.loc[[0, -1], 'idx']
Issue Analytics
- State:
- Created 6 years ago
- Comments: 6 (4 by maintainers)
Top GitHub Comments
@kingjr Have a look at the indexing docs (http://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing) on the different options to index (see also, further down that page, the sections “Selection by label” and “Selection by position”).
The behaviours of `loc` and `iloc` are different on purpose because they serve different goals:
- `loc` is label based: the negative values are not present in the index labels, and hence you get missing values for them. (`loc` returns a result as long as at least one of the requested labels exists; in the case of `df.loc[[-2, -1], 'idx']` no existing label is present and therefore it raises.)
- `iloc` is position based: negative indices here mean ‘start counting from the end’, and therefore the shown result is exactly as expected.
The change from int to float comes from a current limitation of pandas: missing values can only be represented for floats, see http://pandas.pydata.org/pandas-docs/stable/gotchas.html#support-for-integer-na
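The label versus position distinction described above can be sketched as follows. This is a minimal illustration, not code from the issue. It assumes a recent pandas release, where `loc` raises a `KeyError` if any label in the list is missing, so `reindex` is used instead to reproduce the fill-with-NaN result shown in the question. The variable names `wanted`, `by_label`, and `by_position` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({"idx": range(10)})  # default index with labels 0..9

wanted = list(range(3)) + list(range(-3, 0))  # [0, 1, 2, -3, -2, -1]

# Label-based lookup: -3, -2 and -1 are not labels in the index, so
# reindex fills them with NaN, which forces the int column to float.
by_label = df["idx"].reindex(wanted)

# Position-based lookup: negative positions count from the end.
by_position = df["idx"].iloc[wanted]

print(by_label)
print(by_position)
```

The label-based result has three missing values and a float dtype, while the position-based result keeps its integer dtype and holds the first three and last three rows.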
Please read the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#selection-by-label
`.loc` is for label-based indexing. The negative indices are not found and are reindexed to `NaN`. An integer index is, by definition, indexed by label with `.loc`. `.iloc` is always indexed by position.
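To make the point about integer indexes concrete, here is a small sketch using a hypothetical index that runs from -5 to 4 (this frame is an assumption for illustration and does not appear in the issue). Because the negative integers are real labels here, `.loc` finds them, while `.iloc` still counts from the end.

```python
import pandas as pd

# Hypothetical frame whose integer index includes negative labels.
df = pd.DataFrame({"idx": range(10)}, index=range(-5, 5))

# .loc looks up the labels -2 and -1, which sit at positions 3 and 4.
by_label = df.loc[[-2, -1], "idx"]

# .iloc takes the last two rows, whose labels are 3 and 4.
by_position = df.iloc[[-2, -1]]["idx"]

print(by_label)
print(by_position)
```

The same list `[-2, -1]` therefore selects different rows depending on the indexer, which is the intended behaviour rather than an inconsistency.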