Set one-dimensional data variable as dimension coordinate?
Code Sample
I have this dataset, and I’d like to make it indexable by time:
<xarray.Dataset>
Dimensions: (station_observations: 46862)
Dimensions without coordinates: station_observations
Data variables:
time (station_observations) datetime64[ns] ...
SNOW_ON_THE_GROUND (station_observations) float64 ...
ONE_DAY_SNOW (station_observations) float64 ...
ONE_DAY_RAIN (station_observations) float64 ...
ONE_DAY_PRECIPITATION (station_observations) float64 ...
MIN_TEMP (station_observations) float64 ...
MAX_TEMP (station_observations) float64 ...
Attributes:
elevation: 15.0
Problem description
I expected to be able to use ds.set_coords to make the time variable an indexable coordinate. The variable IS converted to a coordinate, but it is not a dimension coordinate, so I can’t index with it. I can use assign_coords(station_observations=ds.time) to make station_observations indexable by time, but then the name is semantically wrong, and the time variable still exists, which makes the code harder to maintain.
Expected Output
ds.set_coords('time', inplace=True)
<xarray.Dataset>
Dimensions: (station_observations: 46862)
Coordinates:
time (station_observations) datetime64[ns] ...
Dimensions without coordinates: station_observations
Data variables:
SNOW_ON_THE_GROUND (station_observations) float64 ...
ONE_DAY_SNOW (station_observations) float64 ...
ONE_DAY_RAIN (station_observations) float64 ...
ONE_DAY_PRECIPITATION (station_observations) float64 ...
MIN_TEMP (station_observations) float64 ...
MAX_TEMP (station_observations) float64 ...
Attributes:
elevation: 15.0
In [95]: ds.sel(time='1896')
ValueError: dimensions or multi-index levels ['time'] do not exist
with assign_coords:
In [97]: ds=ds.assign_coords(station_observations=ds.time)
In [98]: ds.sel(station_observations='1896')
Out[98]:
<xarray.Dataset>
Dimensions: (station_observations: 366)
Coordinates:
* station_observations (station_observations) datetime64[ns] 1896-01-01 ...
Data variables:
time (station_observations) datetime64[ns] ...
SNOW_ON_THE_GROUND (station_observations) float64 ...
ONE_DAY_SNOW (station_observations) float64 ...
ONE_DAY_RAIN (station_observations) float64 ...
ONE_DAY_PRECIPITATION (station_observations) float64 ...
MIN_TEMP (station_observations) float64 ...
MAX_TEMP (station_observations) float64 ...
Attributes:
elevation: 15.0
works correctly, but looks ugly. It would be nice if the time variable could be assigned as a dimension directly. I can drop the time variable and rename the station_observations, but it’s a little annoying to do so.
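For reference, the drop-and-rename workaround described above might look roughly like the sketch below. The dataset is a small synthetic stand-in (the 730-day range and the MAX_TEMP values are made up), and drop_vars assumes a recent xarray release; on older versions such as the 0.10.2 reported here, Dataset.drop would likely be the equivalent call.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the station dataset (values are illustrative only).
times = pd.date_range("1896-01-01", periods=730, freq="D")
ds = xr.Dataset(
    {
        "time": ("station_observations", times),
        "MAX_TEMP": ("station_observations", np.arange(730, dtype=float)),
    },
    attrs={"elevation": 15.0},
)

# 1. Index the existing dimension by the time values.
# 2. Drop the now-redundant "time" data variable.
# 3. Rename the dimension (and its coordinate) to "time".
renamed = (
    ds.assign_coords(station_observations=ds["time"])
    .drop_vars("time")  # assumption: recent xarray; older releases used .drop
    .rename({"station_observations": "time"})
)

# Partial-string selection now works on the "time" dimension.
subset2 = renamed.sel(time="1896")
```

This gets the desired result, but it takes three chained calls for what is conceptually a single operation.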
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None python: 3.6.6.final.0 python-bits: 64 OS: Linux OS-release: 4.16.0-041600-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_AU.UTF-8 LOCALE: en_AU.UTF-8
xarray: 0.10.2 pandas: 0.22.0 numpy: 1.13.3 scipy: 0.19.1 netCDF4: 1.3.1 h5netcdf: None h5py: None Nio: None zarr: None bottleneck: 1.2.0 cyordereddict: None dask: 0.16.0 distributed: None matplotlib: 2.1.1 cartopy: None seaborn: None setuptools: 39.0.1 pip: 9.0.1 conda: None pytest: None IPython: 5.5.0 sphinx: None
Issue Analytics
- Created 5 years ago
- Comments:13 (6 by maintainers)
to get your example to work, use this:
to get both as dimensions, use
Hi @nedclimaterisk. Thanks for raising an issue.
In that case, you can use swap_dims,
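A minimal sketch of the swap_dims approach, using a small synthetic dataset in place of the one from the issue (the 730-day range and the MAX_TEMP values are assumptions for illustration). The idea is to mark time as a coordinate and then swap it in as the dimension, so that it becomes a dimension coordinate that sel can use.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the station dataset (values are illustrative only).
times = pd.date_range("1896-01-01", periods=730, freq="D")
ds = xr.Dataset(
    {
        "time": ("station_observations", times),
        "MAX_TEMP": ("station_observations", np.arange(730, dtype=float)),
    },
    attrs={"elevation": 15.0},
)

# Promote "time" from a data variable to a (non-dimension) coordinate.
ds = ds.set_coords("time")

# Replace the "station_observations" dimension with "time";
# "time" is now a dimension coordinate backed by an index.
swapped = ds.swap_dims({"station_observations": "time"})

# Label-based selection by year works on the new dimension.
subset = swapped.sel(time="1896")
print(subset)
```

Because station_observations had no coordinate of its own, it disappears entirely after the swap, so no separate drop or rename step is needed.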