BUG: MultiIndex set_levels is not symmetrical with get_level_values

See original GitHub issue

Code Sample

import pandas as pd
import io

text = """
A B C
1 1 1
1 2 2
2 1 3
2 2 4
"""
df = pd.read_csv(io.StringIO(text), delimiter = ' ')
# Build the MultiIndex first; get_level_values('A') needs an index level named 'A'
df.set_index(['A', 'B'], inplace=True)
# Note the output of get_level_values is the list of all values [1, 1, 2, 2]
print(df.index.get_level_values('A').tolist())
# Call as reported against pandas 0.18.1
df.index.set_levels(df.index.get_level_values('A').map(lambda x: x * 2), level='A', inplace=True)
print(df.index.get_level_values('A').tolist())
# outputs [2, 2, 2, 2]

Expected Output

[2, 2, 4, 4]

# If I re-order the input as:
text = """
A B C
1 1 1
2 1 2
1 2 3
2 2 4
"""
# it works as I expected giving [2, 4, 2, 4]

Is this a bug or expected behaviour?

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: 0.7.5.None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

2 reactions
bkandel-picwell commented, Sep 22, 2016

@danio Yes, I agree that exposing set_levels and set_labels but not set_level_values (like what get_level_values does) makes it harder for users to see the functionality they want. I think that should be raised as a separate issue. I’ll spend a bit of time to make sure exactly what functionality is needed and then file an issue.
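
A rough sketch of what such a set_level_values could look like. The helper below is hypothetical and not part of pandas; it assumes that rebuilding the index with MultiIndex.from_arrays is acceptable, so it takes the same kind of per-row input that get_level_values returns.

```python
import pandas as pd


def set_level_values(index, values, level):
    """Hypothetical helper (not a pandas API): return a new MultiIndex in
    which the per-row values of `level` are replaced by `values`."""
    # Per-row values of every level, in level order
    arrays = [index.get_level_values(i) for i in range(index.nlevels)]
    # Position of the level being replaced, looked up by name
    position = list(index.names).index(level)
    arrays[position] = values
    # Rebuilding from arrays lets pandas recompute the unique levels itself
    return pd.MultiIndex.from_arrays(arrays, names=list(index.names))


index = pd.MultiIndex.from_arrays([[1, 1, 2, 2], [1, 2, 1, 2]], names=["A", "B"])
doubled = index.get_level_values("A").map(lambda x: x * 2)
new_index = set_level_values(index, doubled, "A")
print(new_index.get_level_values("A").tolist())  # [2, 2, 4, 4]
```

Because the index is rebuilt from full-length arrays, the result does not depend on the order of the rows, unlike the set_levels call in the report.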

0 reactions
danio commented, Sep 22, 2016

@bkandel, yes, the example there is quite convoluted but I think this is caused by the same underlying issue as #13741. I can’t see any options to change the duplicate setting of this issue, probably I don’t have the permissions?

It feels to me that MultiIndex.set_levels is exposing too much of the underlying representation of the index, and it should be replaced by a new function with an interface more like get_level_values. That’s probably not a discussion for the issue tracker though.
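
To illustrate the representation being referred to: a MultiIndex stores each level as its unique values (levels) plus per-row integer positions into them (codes, which were called labels in the pandas versions discussed here). The sketch below assumes a recent pandas where the codes attribute exists. It emulates why passing per-row values as levels produced [2, 2, 2, 2], and shows that mapping the unique level values instead gives the expected result.

```python
import pandas as pd

index = pd.MultiIndex.from_arrays([[1, 1, 2, 2], [1, 2, 1, 2]], names=["A", "B"])

unique_a = index.levels[0].tolist()  # unique values of level A
codes_a = index.codes[0].tolist()    # per-row positions into unique_a
print(unique_a, codes_a)  # [1, 2] [0, 0, 1, 1]

# Emulate the report: per-row values treated as if they were the level values,
# so each row looks up position 0 or 1 in [2, 2, 4, 4]
per_row = [2, 2, 4, 4]
emulated = [per_row[code] for code in codes_a]
print(emulated)  # [2, 2, 2, 2]

# Transform the unique level values instead; the codes stay untouched
fixed = index.set_levels(index.levels[0].map(lambda x: x * 2), level="A")
print(fixed.get_level_values("A").tolist())  # [2, 2, 4, 4]
```

With the reordered input from the report the codes are [0, 1, 0, 1], so the same lookup happens to return [2, 4, 2, 4], which is why that case appeared to work.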


