Resetting Index on slice


Code Sample, a copy-pastable example if possible

# Boolean-mask filter; the result keeps the original row labels (the index is not reset)
df = data_index[data_index['ntorsions'] == 2]

Problem description

When slicing a DataFrame, the index is not reset by default. This becomes an issue if you want to write that DataFrame out, combine it with other DataFrames (good luck with that), or write it out without ending up with two index columns.
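For readers who want to see the behavior being described, here is a minimal sketch. The contents of `data_index` are made up for illustration (the issue does not show the real data); only the column name `ntorsions` comes from the code sample above.

```python
import pandas as pd

# Hypothetical stand-in for the issue's data_index frame.
data_index = pd.DataFrame({
    "ntorsions": [1, 2, 2, 3, 2],
    "energy": [0.5, 1.5, 2.5, 3.5, 4.5],  # invented column, for illustration only
})

# Boolean filtering keeps the row labels of the rows that matched;
# it does not renumber them from zero.
df = data_index[data_index["ntorsions"] == 2]

kept_labels = df.index.tolist()
print(kept_labels)  # [1, 2, 4]
```

The labels 1, 2 and 4 are the positions those rows had in the unfiltered frame, which is what the reporter means by the index not being reset.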

Fixing this will not break code in the wild.

Expected Output

Index being correct - without the need to manually call reset_index over and over again. This is much more intuitive to end users.

-> At the end of a slice, call reset_index(drop=True) on the returned DataFrame, or on the current DataFrame if you are slicing in place.
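The manual workaround the reporter refers to would look roughly like this. The data is again hypothetical; `reset_index` and its `drop` parameter are the real pandas API.

```python
import pandas as pd

# Hypothetical data; only the 'ntorsions' column name comes from the issue.
data_index = pd.DataFrame({
    "ntorsions": [1, 2, 2, 3, 2],
    "energy": [0.5, 1.5, 2.5, 3.5, 4.5],
})

mask = data_index["ntorsions"] == 2

# drop=True discards the old labels and renumbers the rows from 0.
subset = data_index[mask].reset_index(drop=True)
new_labels = subset.index.tolist()

# Without drop=True the old labels are kept as a regular column named "index".
with_old = data_index[mask].reset_index()
old_label_columns = list(with_old.columns)

print(new_labels)         # [0, 1, 2]
print(old_label_columns)  # ['index', 'ntorsions', 'energy']
```

The second form is probably where the "two index columns" complaint comes from: the old labels survive as a column next to the new default index.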

Output of pd.show_versions()

loaded rc file /Users/jadolfbr/.matplotlib/matplotlibrc matplotlib version 1.5.1 verbose.level helpful interactive is False platform is darwin

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 9.0.1
setuptools: 20.3.1
Cython: None
numpy: 1.11.1
scipy: 0.13.0b1
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: None
patsy: 0.4.0
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
jreback commented, Apr 6, 2017

You are not using pandas’ power at all. You are in fact making a big assumption: that the data you are dividing is exactly the same length and lines up perfectly. Maybe that’s always true for you.

I would probably do something like this. In fact this is quite general and deals with missing labeled data.

In [34]: df = DataFrame({'l': [1, 1, 1, 2, 2, 2], 'obs': [1, 2, 3, 1, 2, 3], 'value': [1, 2, 3, 4, 5, 6]})

In [35]: df
Out[35]: 
   l  obs  value
0  1    1      1
1  1    2      2
2  1    3      3
3  2    1      4
4  2    2      5
5  2    3      6

In [36]: df = df.set_index(['l', 'obs'])

In [37]: df
Out[37]: 
       value
l obs       
1 1        1
  2        2
  3        3
2 1        4
  2        5
  3        6

In [38]: df.value.loc[1] / df.value.loc[2]
Out[38]: 
obs
1    0.25
2    0.40
3    0.50
Name: value, dtype: float64

This is a slightly different and IMHO better way of organizing things.

In [39]: df.unstack()
Out[39]: 
    value      
obs     1  2  3
l              
1       1  2  3
2       4  5  6

In [40]: u = df.unstack()

In [41]: u.loc[1] / u.loc[2]
Out[41]: 
       obs
value  1      0.25
       2      0.40
       3      0.50
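To illustrate the point about missing labeled data, here is a small sketch based on the frame above, with one observation removed from group 2. The removed row is an assumption made for the example, not something from the original thread.

```python
import pandas as pd

# Same layout as the example above, but group l == 2 has no obs == 2 row.
df = pd.DataFrame({
    "l": [1, 1, 1, 2, 2],
    "obs": [1, 2, 3, 1, 3],
    "value": [1, 2, 3, 4, 6],
}).set_index(["l", "obs"])

# Division aligns on the 'obs' label, not on row position.
ratio = df["value"].loc[1] / df["value"].loc[2]

print(ratio)
# obs 1 and 3 are divided label-to-label; obs 2 has no partner and becomes NaN.
```

With the index reset and a purely positional division, the two groups would have different lengths and obs 3 of one group would sit next to obs 2 of the other, which is the assumption being warned about.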
0 reactions
jadolfbr commented, Apr 6, 2017

Thanks for the suggestion. Yes, this seems much better than what I was trying to do - use the indexes instead of fighting with them and trying to go around them. Makes sense. I guess this would make joining a whole lot more straightforward too. Awesome. Thanks for taking the time to write back.
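As a rough sketch of the joining point: when two frames share the same index levels, `DataFrame.join` matches rows by label. The frames `left` and `right` and the `weight` column are hypothetical names invented for this example.

```python
import pandas as pd

# Two hypothetical frames keyed by the same (l, obs) labels.
left = pd.DataFrame({
    "l": [1, 1, 2],
    "obs": [1, 2, 1],
    "value": [1, 2, 4],
}).set_index(["l", "obs"])

right = pd.DataFrame({
    "l": [1, 2, 2],
    "obs": [1, 1, 2],
    "weight": [10, 40, 50],
}).set_index(["l", "obs"])

# join() matches on the shared index; how="inner" keeps only the labels
# that appear in both frames.
joined = left.join(right, how="inner")

print(joined)
```

Only the labels (1, 1) and (2, 1) exist in both frames, so those are the rows that survive, each with its own `value` and `weight`, regardless of where the rows sat in either frame.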


