Scalar slice of MultiIndex is turned to tuples
Today I updated to xarray v0.14 and it broke some of my code.
I tried to select one observation from the following dataset:
<xarray.Dataset>
Dimensions:       (genes: 31523, observations: 236)
Coordinates:
  * genes         (genes) object 'ENSG00000227232' ... 'ENSG00000232254'
  * observations  (observations) MultiIndex
  - individual    (observations) object 'GTEX-111YS' ... 'GTEX-ZXG5'
  - subtissue     (observations) object 'Whole_Blood' ... 'Whole_Blood'
Data variables:
    [...]
Calling ds.isel(observations=1) returns:
<xarray.Dataset>
Dimensions:       (genes: 31523)
Coordinates:
  * genes         (genes) object 'ENSG00000227232' ... 'ENSG00000232254'
    observations  object ('GTEX-1122O', 'Whole_Blood')
Data variables:
    [...]
As you can see, observations is now a single tuple, ('GTEX-1122O', 'Whole_Blood').
However, the individual and the subtissue should be kept as coordinates.
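A minimal sketch of the reported behaviour on a toy dataset may help here. The variable name `expression`, the gene labels, and the second tissue are made up for illustration, and the MultiIndex is built with `stack` as an assumption rather than the way the original dataset was created. The exact coordinates printed depend on the installed xarray version.

```python
import numpy as np
import xarray as xr

# Toy stand-in for the dataset above (names and values are illustrative only).
ds = xr.Dataset(
    {
        "expression": (
            ("genes", "individual", "subtissue"),
            np.arange(12.0).reshape(3, 2, 2),
        )
    },
    coords={
        "genes": ["g0", "g1", "g2"],
        "individual": ["GTEX-111YS", "GTEX-1122O"],
        "subtissue": ["Whole_Blood", "Liver"],
    },
).stack(observations=("individual", "subtissue"))  # builds the MultiIndex

# Select a single observation by integer position.
selected = ds.isel(observations=1)

# On xarray 0.14 the issue reports only a tuple-valued `observations`
# coordinate here; other versions may also show the individual levels.
print(selected.coords)
```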
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.16.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1
xarray: 0.14.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.3.1
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.5.2
distributed: 2.5.2
matplotlib: 3.1.1
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: None
pytest: 5.0.1
IPython: 7.8.0
sphinx: None
Issue Analytics
- State:
- Created 4 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
@Hoeze this is now implemented in #5692 (stack is not yet refactored so I reproduced your example in a slightly different way).
I think the right long-term solution for xarray is to always store separate Variable objects for MultiIndex levels, and only use the MultiIndex for proper indexing. When you index out a single value, the MultiIndex will naturally disappear and you’ll be left with a bunch of scalar coordinates, without any special case logic to handle the MultiIndex.
This looks like @crusaderky’s third option. We’ll need to finish up the big “explicit indexes” refactor first to make this viable.
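As a rough illustration of the outcome described above (scalar coordinates for each level after indexing out a single value), here is a sketch of one possible workaround. It is not taken from the thread: it assumes that calling `reset_index` on the MultiIndex dimension before `isel` is acceptable for your use case, and the toy dataset and the name `expression` are made up.

```python
import numpy as np
import xarray as xr

# Toy dataset with a MultiIndex on `observations` (illustrative names only).
ds = xr.Dataset(
    {
        "expression": (
            ("genes", "individual", "subtissue"),
            np.arange(12.0).reshape(3, 2, 2),
        )
    },
    coords={
        "genes": ["g0", "g1", "g2"],
        "individual": ["GTEX-111YS", "GTEX-1122O"],
        "subtissue": ["Whole_Blood", "Liver"],
    },
).stack(observations=("individual", "subtissue"))

# Turn the MultiIndex levels into ordinary coordinates along `observations`,
# then select by position. Each level is then kept as a scalar coordinate.
flat = ds.reset_index("observations")
selected = flat.isel(observations=1)

print(selected["individual"].item(), selected["subtissue"].item())
```

The trade-off is that `flat` no longer has a MultiIndex, so label-based selection on the levels via `sel` is not available on it; keep the original `ds` around if you still need that.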