xarray.DataArray.where always returns array of float64 regardless of input dtype
MCVE Code Sample
import numpy as np
import xarray as xr

a = xr.DataArray(np.arange(25).reshape(5, 5), dims=('x', 'y'))
print(a.dtype)  # 'int32'
a_sub = a.where(a.x + a.y < 4)
a_sub.dtype  # 'float64'
Expected Output
a_sub should be a DataArray of dtype int32.
Problem Description
The documentation (http://xarray.pydata.org/en/stable/generated/xarray.DataArray.where.html) states that the return type should be the same type as the caller. However, the dtype of the returned array is always float64.
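The phrase "same type as caller" can be read in two ways: the container type (DataArray or Dataset) or the element dtype. The sketch below separates the two for the sample above. It is only an illustration; the explicit `dtype="int64"` is an assumption added here so the result does not depend on the platform's default integer size.

```python
import numpy as np
import xarray as xr

# Explicit int64 (an assumption for this sketch) avoids platform-dependent defaults.
a = xr.DataArray(np.arange(25, dtype="int64").reshape(5, 5), dims=("x", "y"))
a_sub = a.where(a.x + a.y < 4)

# The container type is preserved: both objects are DataArray instances.
same_container = type(a_sub) is type(a)

# The element dtype is not preserved: the masked result is promoted to float.
print(same_container)
print(a.dtype, a_sub.dtype)
```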
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.1 | packaged by conda-forge | (default, Mar 13 2019, 13:32:59) [MSC v.1900 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 45 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.13.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.3.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.0.22
cfgrib: None
iris: None
bottleneck: None
dask: 2.5.2
distributed: None
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: None
pytest: None
IPython: 7.8.0
sphinx: None
Issue Analytics
- Created 4 years ago
- Comments: 12 (12 by maintainers)
Top GitHub Comments
Yes, I read the return type as the ‘same type as caller’ and at first I expected the array dtype to be the same. I soon realized that it means a DataArray or Dataset. And for the output array to support NaN values, it has to be float. My bad, sorry for the clutter.
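To make the point about NaN concrete, here is a small sketch (not taken from the thread) that compares the default fill value with an explicit integer sentinel passed through the `other` argument of `DataArray.where`. The sentinel value `-1` and the explicit `dtype="int64"` are assumptions chosen for illustration.

```python
import numpy as np
import xarray as xr

a = xr.DataArray(np.arange(25, dtype="int64").reshape(5, 5), dims=("x", "y"))
cond = a.x + a.y < 4

# Default: masked entries become NaN, which integers cannot represent,
# so the result is promoted to a floating-point dtype.
masked_nan = a.where(cond)

# With an integer replacement value (-1 is an arbitrary sentinel here),
# no NaN is needed and the result can stay an integer array.
masked_int = a.where(cond, other=-1)

print(masked_nan.dtype)
print(masked_int.dtype)
```

Whether a sentinel is appropriate depends on the data: it only works if the chosen value cannot occur as a legitimate entry.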
I’m not sure that either of these is a good idea.
The problem with raising a warning is that this is well-defined behavior. It may not always be useful, but well-defined yet useless behavior arises all the time in programs, so it’s annoying to raise a warning for a special case.
The problem with skipping where_method is that now we end up with a potentially inconsistent dtype, depending on the selection. These sorts of special cases can be quite frustrating to program around.
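The inconsistency can be sketched with a hypothetical helper. `where_skipping_noop` below is not part of xarray and is only one possible reading of "skipping where_method": it returns the input untouched when the condition would mask nothing. Under that assumption, the output dtype depends on the data rather than on the call.

```python
import numpy as np
import xarray as xr


def where_skipping_noop(da, cond):
    """Hypothetical variant of where(): skip masking when cond is all True."""
    if bool(cond.all()):
        return da  # nothing to mask, so the original dtype is kept
    return da.where(cond)  # something is masked, so the dtype is promoted


a = xr.DataArray(np.arange(25, dtype="int64").reshape(5, 5), dims=("x", "y"))

keep_all = where_skipping_noop(a, a >= 0)           # condition is True everywhere
keep_some = where_skipping_noop(a, a.x + a.y < 4)   # condition masks some entries

# Same function, same input array, two different output dtypes.
print(keep_all.dtype, keep_some.dtype)

# The existing behavior is uniform: float output even when nothing is masked.
always_float = a.where(a >= 0)
print(always_float.dtype)
```

Downstream code would then have to handle both dtypes, which is the kind of special case the comment warns about.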