BUG: MultiIndex set_levels is not symmetrical with get_level_values
Code Sample
import pandas as pd
import io
text = """
A B C
1 1 1
1 2 2
2 1 3
2 2 4
"""
df = pd.read_csv(io.StringIO(text), delimiter = ' ')
df.set_index(['A','B'], inplace = True)
# Note the output of get_level_values is the list of all values [1, 1, 2, 2]
print(df.index.get_level_values('A').tolist())
df.index.set_levels(df.index.get_level_values('A').map(lambda x: x * 2), level='A', inplace=True)
print(df.index.get_level_values('A').tolist())
# outputs [2, 2, 2, 2]
Expected Output
[2, 2, 4, 4]
# If I re-order the input as:
text = """
A B C
1 1 1
2 1 2
1 2 3
2 2 4
"""
# it works as I expected giving [2, 4, 2, 4]
Is this a bug or expected behaviour?
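One way to read the two results above is through a simplified model of how a MultiIndex level is stored: a list of distinct values (the levels) plus one integer position per row pointing into that list. The sketch below is plain Python and only an assumption about the mechanism, not pandas internals; level_values is a hypothetical stand-in for get_level_values. It reproduces both reported outputs when per-row values are passed where the distinct levels are expected.

```python
# Simplified model (an assumption, not pandas code): one MultiIndex level is
# stored as distinct `levels` plus per-row integer `codes` into that list.

def level_values(levels, codes):
    """Rebuild the per-row values, roughly what get_level_values returns."""
    return [levels[c] for c in codes]

levels = [1, 2]

# First table: column A is [1, 1, 2, 2]
codes_1 = [0, 0, 1, 1]
per_row_1 = level_values(levels, codes_1)
# Per-row values used as if they were the distinct levels
wrong_levels_1 = [x * 2 for x in per_row_1]
result_1 = level_values(wrong_levels_1, codes_1)
print(result_1)  # [2, 2, 2, 2]

# Second table: column A is [1, 2, 1, 2]
codes_2 = [0, 1, 0, 1]
per_row_2 = level_values(levels, codes_2)
wrong_levels_2 = [x * 2 for x in per_row_2]
result_2 = level_values(wrong_levels_2, codes_2)
print(result_2)  # [2, 4, 2, 4]

# Doubling the distinct levels instead gives the expected output
expected = level_values([x * 2 for x in levels], codes_1)
print(expected)  # [2, 2, 4, 4]
```

Under this model the second ordering only appears to work because the first two rows happen to contain each distinct value once, in level order.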
output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.18.1
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: 0.7.5.None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None
Issue Analytics
- State:
- Created 7 years ago
- Comments:6 (3 by maintainers)
@danio Yes, I agree that exposing set_levels and set_labels but not set_level_values (like what get_level_values does) makes it harder for users to see the functionality they want. I think that should be raised as a separate issue. I’ll spend a bit of time to make sure exactly what functionality is needed and then file an issue.
@bkandel, yes, the example there is quite convoluted but I think this is caused by the same underlying issue as #13741. I can’t see any option to change the duplicate setting of this issue; probably I don’t have the permissions?
It feels to me that MultiIndex.set_levels is exposing too much of the underlying representation of the index, and it should be replaced by a new function with an interface more like get_level_values. That’s probably not a discussion for the issue tracker though.
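For illustration, here is a minimal sketch of what such a function could look like when built on the existing API. The name set_level_values and its signature are hypothetical and not part of pandas. It assumes that set_levels expects the distinct values of a level rather than one value per row, so the function is mapped over index.levels and the per-row positions are left untouched.

```python
import pandas as pd


def set_level_values(index, func, level):
    """Hypothetical helper (not a pandas API): apply func to one level's values."""
    # Find the position of the named level, then map over its distinct values only.
    position = index.names.index(level)
    new_levels = index.levels[position].map(func)
    # set_levels returns a new MultiIndex; the original index is not modified.
    return index.set_levels(new_levels, level=level)


df = pd.DataFrame(
    {"A": [1, 1, 2, 2], "B": [1, 2, 1, 2], "C": [1, 2, 3, 4]}
).set_index(["A", "B"])

df.index = set_level_values(df.index, lambda x: x * 2, level="A")
result = df.index.get_level_values("A").tolist()
print(result)  # [2, 2, 4, 4]
```

The sketch does not handle a func that maps two distinct level values to the same result; in that case the new levels would no longer be unique, and the index would have to be rebuilt from the per-row values instead.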