Multiindex slicing with NaNs, unexpected results

Code Sample, a copy-pastable example if possible

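import pandas as pd

# pd.np is the NumPy alias exposed by pandas 0.22 (removed in pandas 2.0;
# on newer versions use `import numpy as np` and `np` instead).
# Two rows, three columns; the second column level contains a NaN label.
columns = pd.MultiIndex.from_tuples(
    [('a', 'foo'), ('b', 'bar'), ('b', pd.np.nan)],
    names=['first', 'second'],
)
df = pd.DataFrame(pd.np.random.rand(2, 3), columns=columns)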
# EXPECTED slicing everything on first level
df.loc[:, (['a', 'b'])]
Out[35]: 
first          a         b          
second       foo       bar       NaN
0       0.678021  0.383672  0.074164
1       0.738492  0.992545  0.661247

# EXPECTED just slicing one value from first level
df.loc[:, (['b'])]
Out[29]: 
first          b          
second       bar       NaN
0       0.383672  0.074164
1       0.992545  0.661247

# EXPECTED slicing out b, bar
df.loc[:, (['b'], ['bar'])]
Out[33]: 
first          b
second       bar
0       0.383672
1       0.992545

# UNEXPECTED slicing out b, nan
df.loc[:, (['b'], [pd.np.nan])]
Out[36]: 
Empty DataFrame
Columns: []
Index: [0, 1]

# UNEXPECTED slicing out b, [nan, 'bar']
df.loc[:, (['b'], ['bar', pd.np.nan])]
Out[39]: 
first          b
second       bar
0       0.383672
1       0.992545

# EXPECTED slicing out b, nan without the index
df.loc[:, ('b', pd.np.nan)]
Out[37]: 
0    0.074164
1    0.661247
Name: (b, nan), dtype: float64

Problem description

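When a list of labels is passed for one level of a MultiIndex and that list includes NaN, the columns labelled NaN on that level are not returned. The list ['bar', pd.np.nan] returns only the 'bar' column, and the list [pd.np.nan] returns an empty DataFrame, even though the scalar tuple ('b', pd.np.nan) finds the column.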
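One possible workaround, sketched here and not taken from the original report, is to build a boolean mask from the column level values and test for NaN explicitly with `isna()` instead of passing NaN inside a list. The sketch imports NumPy directly rather than going through `pd.np`, and it uses fixed values from `np.arange` in place of the random data so the selected columns are easy to recognise; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Same column layout as the report, with predictable values (an assumption
# made for readability; the report used random data).
columns = pd.MultiIndex.from_tuples(
    [("a", "foo"), ("b", "bar"), ("b", np.nan)],
    names=["first", "second"],
)
df = pd.DataFrame(np.arange(6.0).reshape(2, 3), columns=columns)

# Per-column labels of each level, aligned with the columns of df.
first = df.columns.get_level_values("first")
second = df.columns.get_level_values("second")

# ('b', NaN) only: NaN is matched with isna(), not by equality.
mask_nan = (first == "b") & second.isna()
only_nan = df.loc[:, mask_nan]

# ('b', 'bar') together with ('b', NaN).
mask_both = (first == "b") & (second.isin(["bar"]) | second.isna())
both = df.loc[:, mask_both]

print(only_nan)
print(both)
```

The mask approach keeps both column levels in the result, which matches the shape of the output the report expects from the list-based selection.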

Expected Output

I expect both of these to work:

df.loc[:, (['b'], ['bar', pd.np.nan])]
Out[40]: 
first          b          
second       bar       NaN
0       0.383672  0.074164
1       0.992545  0.661247

df.loc[:, (['b'], [pd.np.nan])]
Out[40]: 
first          b          
second       NaN
0       0.074164
1       0.661247
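To check whether an installed pandas version still shows the behaviour, a small probe along the following lines may help. It is only a sketch: the function name `nan_list_selection_works` is hypothetical, the data is fixed instead of random, and exceptions are treated as "does not work" on the assumption that some versions might raise rather than return an empty frame.

```python
import numpy as np
import pandas as pd


def nan_list_selection_works():
    """Return True if df.loc[:, (['b'], [NaN])] yields the NaN-labelled column."""
    columns = pd.MultiIndex.from_tuples(
        [("a", "foo"), ("b", "bar"), ("b", np.nan)],
        names=["first", "second"],
    )
    df = pd.DataFrame(np.arange(6.0).reshape(2, 3), columns=columns)
    try:
        result = df.loc[:, (["b"], [np.nan])]
    except Exception:
        # Assumption: a version that raises here is counted as not working.
        return False
    # The ('b', NaN) column holds the values 2.0 and 5.0 in this frame.
    return result.shape == (2, 1) and result.iloc[:, 0].tolist() == [2.0, 5.0]


print(pd.__version__, nan_list_selection_works())
```

The probe prints the pandas version next to the result, which makes it easy to compare environments.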

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.15.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.36.3.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.22.0
pytest: 3.10.0
pip: 18.1
setuptools: 40.5.0
Cython: 0.28.5
numpy: 1.14.2
scipy: 1.0.1
pyarrow: None
xarray: 0.10.9
IPython: 5.8.0
sphinx: 1.8.1
patsy: 0.5.1
dateutil: 2.7.2
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.7
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.9
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml: 4.2.1
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.11
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

phofl commented, Jan 12, 2021 (1 reaction)

As long as no one is assigned, you are free to go 😃

theodorju commented, Jan 12, 2021 (1 reaction)

Hi, I’d like to try and work on adding some tests for this one.
