Inconsistency between the types of Dataset.dims and DataArray.dims

DataArray.dims is currently a tuple, whereas Dataset.dims is a dict. This results in ugly code like this, taken from xarray/core/groupby.py:

        try:  # Dataset
            expected_size = obj.dims[group_dim]
        except TypeError:  # DataArray
            expected_size = obj.shape[obj.get_axis_num(group_dim)]

One way to resolve this inconsistency would be to switch DataArray.dims to a (frozen) OrderedDict. The downside is that code like x.dims[0] wouldn't work anymore (unless we add some terrible fallback logic). You'd have to write x.dims.keys()[0], which is pretty ugly. On the plus side, x.dims['time'] would always return the size of the time dimension, regardless of whether x is a DataArray or a Dataset.

Another option would be to add an attribute dim_shape (or some other name) that serves as an alias to dims on Dataset and an alias to OrderedDict(zip(dims, shape)) on DataArray. This would be fully backwards compatible, but would further pollute the namespace on Dataset and DataArray.
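To make the two proposals concrete, here is a minimal sketch of the mapping that OrderedDict(zip(dims, shape)) would produce. It uses only the standard library: the dims and shape values are made-up stand-ins for a DataArray's attributes, and wrapping the result in MappingProxyType is just one assumed way to get the "frozen" behaviour, not necessarily what xarray would do.

```python
from collections import OrderedDict
from types import MappingProxyType

# Hypothetical stand-ins for DataArray.dims and DataArray.shape.
dims = ("time", "lat", "lon")
shape = (12, 90, 180)

# Read-only, ordered name -> size mapping, like the proposed alias.
dim_shape = MappingProxyType(OrderedDict(zip(dims, shape)))

# Lookup by dimension name works the same way it does on Dataset.dims.
time_size = dim_shape["time"]

# Positional access no longer works on a mapping, so the first
# dimension name has to be taken from the iteration order instead.
first_dim = next(iter(dim_shape))
```

Note that on Python 3 the x.dims.keys()[0] spelling mentioned above would raise a TypeError, because dict views are not indexable; next(iter(...)) or list(...)[0] would be needed instead.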

Issue Analytics

State:
Created 7 years ago
Comments: 10 (6 by maintainers)

Top GitHub Comments

1 reaction

shoyer commented, Oct 22, 2016

With optional indexes (#1017) meaning that ds.coords[...].size may not always succeed (OK, depending on some implementation choices), the need for a consistent way to look up dimension sizes becomes even more severe.

Rather than resolving the type inconsistency of dims, we could simply add a new attribute sizes to both Dataset and DataArray that is exactly equivalent to Dataset.dims. The Dataset API ends up a little bit redundant, but we have a consistent way to access this information.

0 reactions

shoyer commented, Jan 25, 2019

We have .sizes now.
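With .sizes available on both DataArray and Dataset, the try/except from the original report can collapse to a single lookup. The sketch below is illustrative only: expected_size is a hypothetical helper name, and the two objects are plain stand-ins (not real xarray objects) that mimic the dims and sizes attributes so the snippet runs without xarray installed.

```python
from types import SimpleNamespace


def expected_size(obj, group_dim):
    """Return the length of `group_dim` for any object exposing `.sizes`."""
    return obj.sizes[group_dim]


# Stand-ins mimicking the attribute types described in the issue:
# tuple dims on the array-like object, dict dims on the dataset-like one.
fake_array = SimpleNamespace(dims=("time", "x"), sizes={"time": 4, "x": 3})
fake_dataset = SimpleNamespace(dims={"time": 4, "x": 3}, sizes={"time": 4, "x": 3})

# The same call works for both, with no TypeError fallback needed.
array_size = expected_size(fake_array, "time")
dataset_size = expected_size(fake_dataset, "time")
```

The same obj.sizes[group_dim] expression should apply unchanged to real DataArray and Dataset objects, since both expose a name-to-length mapping under that attribute.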
