Inconsistency between the types of Dataset.dims and DataArray.dims
DataArray.dims is currently a tuple, whereas Dataset.dims is a dict. This results in ugly code like this, taken from xarray/core/group.py:
    try:  # Dataset
        expected_size = obj.dims[group_dim]
    except TypeError:  # DataArray
        expected_size = obj.shape[obj.get_axis_num(group_dim)]
One way to resolve this inconsistency would be to switch DataArray.dims to a (frozen) OrderedDict. The downside is that code like x.dims[0] wouldn’t work anymore (unless we add some terrible fallback logic). You’d have to write x.dims.keys()[0], which is pretty ugly. On the plus side, x.dims['time'] would always return the size of the time dimension, regardless of whether x is a DataArray or Dataset.
Another option would be to add an attribute dim_shape (or some other name) that serves as an alias to dims on Dataset and an alias to OrderedDict(zip(dims, shape)) on DataArray. This would be fully backwards compatible, but would further pollute the namespace on Dataset and DataArray.
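To make the second option concrete, here is a minimal sketch of the proposed mapping. FakeArray is a hypothetical stand-in rather than an xarray class, and dim_shape is only the placeholder name used in this issue; the sketch assumes nothing beyond an object that exposes dims and shape.

```python
from collections import OrderedDict


class FakeArray:
    """Hypothetical stand-in for a DataArray: it only carries dims and shape."""

    def __init__(self, dims, shape):
        self.dims = tuple(dims)
        self.shape = tuple(shape)

    @property
    def dim_shape(self):
        # Map each dimension name to its length, preserving axis order.
        return OrderedDict(zip(self.dims, self.shape))


arr = FakeArray(dims=("time", "x"), shape=(10, 3))

time_size = arr.dim_shape["time"]  # look up a size by dimension name
first_dim = arr.dims[0]            # positional access on dims keeps working
```

Because dims stays a tuple, existing code such as x.dims[0] is unaffected, which is the backwards-compatibility argument made above.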
Issue Analytics
- Created 7 years ago
- Comments:10 (6 by maintainers)

With optional indexes (#1017) meaning that ds.coords[...].size may not always succeed (OK, depending on some implementation choices), the need for a consistent way to look up dimension sizes becomes even more severe.
Rather than resolving the type inconsistency of dims, we could simply add a new attribute sizes to both Dataset and DataArray that is exactly equivalent to Dataset.dims. The Dataset API ends up a little bit redundant, but we have a consistent way to access this information.
We have .sizes now.
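With .sizes available on both types, the size lookup from the original snippet no longer needs a try/except. A minimal sketch, assuming a recent xarray and NumPy are installed; dim_size is a hypothetical helper name used only for illustration:

```python
import numpy as np
import xarray as xr


def dim_size(obj, dim):
    """Return the length of dimension `dim` for a DataArray or a Dataset."""
    # .sizes maps dimension names to lengths on both types.
    return obj.sizes[dim]


arr = xr.DataArray(np.zeros((10, 3)), dims=("time", "x"))
ds = xr.Dataset({"a": arr})

arr_time = dim_size(arr, "time")  # DataArray lookup by name
ds_time = dim_size(ds, "time")    # same call works for a Dataset
first_dim = arr.dims[0]           # DataArray.dims is still a tuple
```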