Scalar slice of MultiIndex is turned to tuples

Today I updated to v0.14 of xarray and it broke some of my code.

I tried to select one observation of the following dataset:

<xarray.Dataset>
Dimensions:       (genes: 31523, observations: 236)
Coordinates:
  * genes         (genes) object 'ENSG00000227232' ... 'ENSG00000232254'
  * observations  (observations) MultiIndex
  - individual    (observations) object 'GTEX-111YS' ... 'GTEX-ZXG5'
  - subtissue     (observations) object 'Whole_Blood' ... 'Whole_Blood'
Data variables:
    [...]

ds.isel(observations=1):

<xarray.Dataset>
Dimensions:       (genes: 31523)
Coordinates:
  * genes         (genes) object 'ENSG00000227232' ... 'ENSG00000232254'
    observations  object ('GTEX-1122O', 'Whole_Blood')
Data variables:
    [...]

As you can see, observations is now a tuple of ('GTEX-1122O', 'Whole_Blood'). However, the individual and the subtissue should be kept as coordinates.
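
The tuple most likely comes straight from pandas: a pandas MultiIndex behaves like an array of tuples, so taking a single position from it yields one tuple rather than separate level values. A minimal pandas-only sketch of that behaviour follows. The three-entry index is made up for illustration from the IDs shown above and is not the reporter's actual data.

```python
import pandas as pd

# Hypothetical three-entry index built from the IDs visible in the report.
index = pd.MultiIndex.from_arrays(
    [["GTEX-111YS", "GTEX-1122O", "GTEX-ZXG5"], ["Whole_Blood"] * 3],
    names=["individual", "subtissue"],
)

# A scalar lookup on a MultiIndex returns a plain tuple.
entry = index[1]
print(entry)  # ('GTEX-1122O', 'Whole_Blood')

# The per-level values are still available, but only via the index itself.
individual = index.get_level_values("individual")[1]
print(individual)  # GTEX-1122O
```

Once the scalar has been pulled out as a tuple, the level names are no longer attached to it, which matches the `observations` coordinate seen in the output above.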

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.16.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1

xarray: 0.14.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.3.1
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.5.2
distributed: 2.5.2
matplotlib: 3.1.1
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: None
pytest: 5.0.1
IPython: 7.8.0
sphinx: None

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

2 reactions
benbovy commented, Sep 15, 2021

@Hoeze this is now implemented in #5692 (stack is not yet refactored so I reproduced your example in a slightly different way):

>>> stacked.isel(observations=1)
<xarray.Dataset>
Dimensions:       (genes: 2)
Coordinates:
  * genes         (genes) <U1 'a' 'b'
    observations  object ('c', 'f')
    individuals   <U1 'c'
    subtissues    <U1 'f'
Data variables:
    test          (genes) int64 2 2

0 reactions
shoyer commented, Oct 22, 2019

I think the right long-term solution for xarray is to always store separate Variable objects for MultiIndex levels, and only use the MultiIndex for proper indexing. When you index out a single value, the MultiIndex will naturally disappear and you’ll be left with a bunch of scalar coordinates, without any special case logic to handle the MultiIndex.

This looks like @crusaderky’s third option.

We’ll need to finish up the big “explicit indexes” refactor first to make this viable.
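
On versions without that refactor, one possible workaround is to drop the MultiIndex with `reset_index` before selecting, so that the levels are already plain coordinates when the scalar is taken. This is an assumption on my part and was not proposed in the thread. The sketch below uses a small invented dataset and only `Dataset.set_index`, `Dataset.reset_index` and `Dataset.isel`.

```python
import numpy as np
import xarray as xr

# Invented stand-in for the reporter's dataset: 2 genes x 3 observations.
ds = xr.Dataset(
    {"test": (("genes", "observations"), np.arange(6).reshape(2, 3))},
    coords={
        "genes": ["a", "b"],
        "individual": ("observations", ["GTEX-111YS", "GTEX-1122O", "GTEX-ZXG5"]),
        "subtissue": ("observations", ["Whole_Blood"] * 3),
    },
).set_index(observations=["individual", "subtissue"])  # build the MultiIndex

# Turn the MultiIndex levels back into ordinary coordinates, then select.
flat = ds.reset_index("observations")
one = flat.isel(observations=1)

# Both levels survive as separate scalar coordinates.
print(one["individual"].item(), one["subtissue"].item())
```

The trade-off is that `flat` no longer has an index on `observations`, so label-based `.sel` on the levels is unavailable until `set_index` is applied again.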
