Q: DataFrame `loc` and `iloc` seem to have inconsistent negative indexing behaviors.


With pandas version 0.20.3, DataFrame loc and iloc appear to behave inconsistently (and, it seems, incorrectly) when given negative indices.

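import pandas as pd
df = pd.DataFrame([dict(idx=idx) for idx in range(10)])
# On Python 3, range objects cannot be joined with +, so convert them to lists first
print(df.loc[list(range(3)) + list(range(-3, 0)), 'idx'])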

returns NaN for the negative indices:

 0    0.0
 1    1.0
 2    2.0
-3    NaN
-2    NaN
-1    NaN

(also note that somehow the int became float)

whereas

print(df.iloc[list(range(3)) + list(range(-3, 0))])

returns the last rows:

0    0
1    1
2    2
7    7
8    8
9    9

Additionally, loc fails if only negative indices are passed, as in df.loc[[-2, -1], 'idx'], but not if both positive and negative indices are passed, as in df.loc[[0, -1], 'idx'].

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

3 reactions
jorisvandenbossche commented, Aug 18, 2017

@kingjr Have a look at the indexing docs (http://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing) on the different options for indexing (see also, further down that page, the sections “Selection by label” and “Selection by position”).

The behaviours of loc and iloc are different on purpose because they serve different goals:

  • loc is label-based: the negative values are not present in the index labels, so you get missing values for them. loc returns a result as long as at least one of the requested labels exists; in the case of df.loc[[-2, -1], 'idx'] none of the labels exist, and therefore it raises.

  • iloc is position based: negative indices here mean ‘start to count from the end’, and therefore the shown result is perfectly as expected
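The distinction can be sketched on a small frame. This is only an illustration, assuming a recent pandas; the `shifted` frame and its index running from -5 to 4 are hypothetical and not part of the original issue. They are there to show that a negative number passed to loc is looked up as a label like any other.

```python
import pandas as pd

df = pd.DataFrame({"idx": range(10)})

# Position-based: negative positions count back from the end of the frame.
last_three = df.iloc[[-3, -2, -1]]["idx"].tolist()

# Label-based: the default index holds the labels 0..9, so -1 is not a label.
has_label = -1 in df.index

# Hypothetical frame whose index does contain negative labels (-5..4).
shifted = df.copy()
shifted.index = range(-5, 5)

by_label = shifted.loc[-1, "idx"]      # the row labelled -1 (fifth row)
by_position = shifted.iloc[-1]["idx"]  # the last row, whatever its label

print(last_three, has_label, by_label, by_position)
```

With the shifted index, loc[-1] and iloc[-1] pick different rows, which is the same difference the issue ran into.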

(also note that somehow the int became float…)

This is currently a limitation of pandas: missing values can only be represented in float columns, not integer ones, so the column is upcast to float as soon as a NaN appears. See http://pandas.pydata.org/pandas-docs/stable/gotchas.html#support-for-integer-na
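A short sketch of that upcast, using reindex to introduce a missing label. The second half relies on the nullable "Int64" dtype, which was added in pandas releases later than the 0.20.3 discussed here, so treat it as an option only if a newer pandas is available.

```python
import pandas as pd

df = pd.DataFrame({"idx": range(10)})
labels = [0, 1, -1]  # -1 is not a label in the default index

# Plain integer column: the missing label forces an upcast to float64.
as_float = df["idx"].reindex(labels)

# Nullable integer column (newer pandas only): stays integer, missing is <NA>.
nullable = df["idx"].astype("Int64").reindex(labels)

print(as_float.dtype, nullable.dtype)
```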

2 reactions
jreback commented, Aug 18, 2017

Please read the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#selection-by-label

.loc is for label-based indexing. The negative indices are not found among the labels, so they are reindexed to NaN. With .loc, an integer index is by definition indexed by label.

.iloc is always positional indexed.
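On newer pandas releases (1.0 and later), .loc no longer reindexes silently: a list containing any missing label raises a KeyError. The sketch below, which assumes such a release, shows two explicit alternatives. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"idx": range(10)})
wanted = [0, 1, 2, -3, -2, -1]

# Newer pandas raises here; 0.20.3 returned NaN rows for the missing labels.
try:
    df.loc[wanted, "idx"]
    raised = False
except KeyError:
    raised = True

# Option 1: ask for the NaN-filling behavior explicitly with reindex.
filled = df["idx"].reindex(wanted)

# Option 2: keep only the labels that actually exist, then use .loc.
present = [label for label in wanted if label in df.index]
subset = df.loc[present, "idx"]

print(raised, filled.isna().sum(), subset.tolist())
```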
