DataArray.set_index throws error on documented input
Problem Description
The docs for DataArray.set_index describe the main indexes argument as:
Mapping from names matching dimensions and values given by (lists of) the names of existing coordinates or variables to set as new (multi-)index.
This suggests that one can set a DataArray instance's coordinates by passing in a dimension and a list-like object of coordinates.
MCVE
In [1]: import numpy as np
In [2]: import xarray as xr
In [3]: arr = xr.DataArray(data=np.ones((2, 3)), dims=['x', 'y'])
In [4]: arr.dims
Out[4]: ('x', 'y')
In [5]: arr.set_index({'x': range(2)})
KeyError
...
144 for n in var_names:
--> 145 var = variables[n]
146 if (current_index_variable is not None and
147 var.dims != current_index_variable.dims):
KeyError: 0
At first, I thought it might be because coords and _coords were not being set in this case:
In [18]: arr.coords
Out[18]:
Coordinates:
*empty*
In [19]: arr._coords
Out[19]: OrderedDict()
but even if I set the coordinates first and then try to re-index, it fails:
In [20]: arr = xr.DataArray(data=np.ones((2, 3)), dims=['x', 'y'], coords={'x': range(2), 'y': range(3)})
In [21]: arr.set_index({'x': ['a', 'b', 'c']})
...
144 for n in var_names:
--> 145 var = variables[n]
146 if (current_index_variable is not None and
147 var.dims != current_index_variable.dims):
Expected Output
I expect my MCVE to work based on the documentation.
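For comparison, here is a minimal sketch of one way to attach labels to a dimension, assuming the goal is simply to give x its own coordinate values. It uses assign_coords, which accepts the values themselves, rather than set_index. The label values are illustrative, and there must be one label per element along the dimension (two for x here, so a three-element list such as ['a', 'b', 'c'] would not fit).

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(data=np.ones((2, 3)), dims=["x", "y"])

# assign_coords takes the coordinate values directly (one per element of x).
labeled = arr.assign_coords(x=["a", "b"])

# The new labels can now be used for label-based selection.
row_b = labeled.sel(x="b")
```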
Problem Solution
My guess is that the issue is that xarray is using the merge_indexes function (see here) from the Dataset module, and there is no concept of a variable in a DataArray.
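As a hedged illustration of what the documented signature seems to describe, the sketch below passes the names of coordinates that already exist on the array, not new values. The coordinate names letter and number are made up for the example.

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(
    data=np.ones((2, 3)),
    dims=["x", "y"],
    coords={
        # Hypothetical non-index coordinates attached to the x dimension.
        "letter": ("x", ["a", "b"]),
        "number": ("x", [10, 20]),
    },
)

# One existing coordinate name: its values become the index along x.
single = arr.set_index(x="letter")

# A list of existing coordinate names: they are combined into a multi-index along x.
multi = arr.set_index(x=["letter", "number"])
```

Read this way, the values in the mapping are looked up by name among the existing coordinates, which would be consistent with the KeyError raised when range(2) is passed instead.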
Issue Analytics
- State:
- Created 4 years ago
- Comments:7 (4 by maintainers)
Top GitHub Comments
Looks like the idea of a glossary is already being discussed in https://github.com/pydata/xarray/issues/2410.
Good work on finding that issue. I think even if we can get something brief in, that would be helpful.
On the specific definitions:
For me ‘dimension’ has a precise definition from traditional sciences, so having our ‘coordinate’ be an additional / auxiliary / alternative dimension wouldn’t be consistent with that (e.g. a 4-dimensional array would still be 4 dimensional regardless of how many coordinates it had).
👍