BUG: Indexes still include values that have been deleted

Using pandas 0.10. If we create a DataFrame with a MultiIndex and then delete all the rows with value X, we’d expect the index to no longer show value X. But it does. Note the apparent inconsistency between “index” and “index.levels”: one shows that the values have been deleted, but the other doesn’t.

import pandas

x = pandas.DataFrame([['deleteMe', 1, 9], ['keepMe', 2, 9], ['keepMeToo', 3, 9]], columns=['first', 'second', 'third'])
x = x.set_index(['first', 'second'], drop=False)

x = x[x['first'] != 'deleteMe']  # Chop off all the 'deleteMe' rows

print(x.index)  # Good: index no longer has any rows with 'deleteMe'. But...

print(x.index.levels)  # Bad: levels still show the "deleteMe" values are there. But why? We deleted them.

x.groupby(level='first').sum()  # Bad: it's creating a dummy row for the rows we deleted!

We don’t want the deleted values to show up in that groupby. Can we eliminate them?

Issue Analytics

  • State: closed
  • Created 11 years ago
  • Comments: 34 (24 by maintainers)

Top GitHub Comments

3 reactions
toobaz commented, Dec 4, 2017

I think this can be closed: the default behavior is as intended, and the method MultiIndex.remove_unused_levels() has been added as a simple fix for whoever doesn’t like the default behavior.
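
As a minimal sketch of the fix mentioned in this comment, the example below rebuilds the frame from the issue and reassigns the index after calling MultiIndex.remove_unused_levels(). It assumes a pandas version that ships that method; the variable names (`filtered`, `cleaned`, `stale`, `fresh`) are illustrative and not from the original thread.

```python
import pandas as pd

# Same data as in the issue report.
df = pd.DataFrame(
    [["deleteMe", 1, 9], ["keepMe", 2, 9], ["keepMeToo", 3, 9]],
    columns=["first", "second", "third"],
).set_index(["first", "second"], drop=False)

# Boolean filtering drops the rows but keeps the original level values.
filtered = df[df["first"] != "deleteMe"]
stale = list(filtered.index.levels[0])

# remove_unused_levels() returns a new MultiIndex, so it has to be
# assigned back to the frame to take effect.
cleaned = filtered.copy()
cleaned.index = cleaned.index.remove_unused_levels()
fresh = list(cleaned.index.levels[0])

print(stale)
print(fresh)
```

The method does not modify the index in place, which is why the result is assigned back to `cleaned.index`.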

2 reactions
ghost commented, Dec 12, 2013

The pandas API doesn’t fit in my head anymore. For reference, df.index.get_level_values might be relevant for whatever use case this was a problem for. It does the right thing.

import pandas

x = pandas.DataFrame([['deleteMe', 1, 9], ['keepMe', 2, 9], ['keepMeToo', 3, 9]], columns=['first', 'second', 'third'])
x = x.set_index(['first', 'second'], drop=False)

print(x.index.get_level_values(0))  # Before filtering: all three labels
x = x[x['first'] != 'deleteMe']  # Chop off all the 'deleteMe' rows
print(x.index.get_level_values(0))  # After filtering: 'deleteMe' is gone
Index([u'deleteMe', u'keepMe', u'keepMeToo'], dtype='object')
Index([u'keepMe', u'keepMeToo'], dtype='object')
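
The contrast this comment points at can be sketched on a bare MultiIndex: get_level_values reports only the labels that are actually present, while levels keeps the full set of values the index was built with. This is an illustrative example rather than code from the thread; the names `kept`, `present`, and `catalogue` are hypothetical.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("deleteMe", 1), ("keepMe", 2), ("keepMeToo", 3)],
    names=["first", "second"],
)

# Keep only the entries whose first level is not 'deleteMe'.
kept = index[index.get_level_values("first") != "deleteMe"]

# Labels that actually occur in the filtered index.
present = list(kept.get_level_values("first").unique())

# The stored level values, which filtering leaves untouched.
catalogue = list(kept.levels[0])

print(present)
print(catalogue)
```

When only the labels currently in use matter, reading them through get_level_values avoids having to rebuild the levels at all.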
