assign values from `xr.groupby_bins` to new `variable`
Code Sample, a copy-pastable example if possible
A “Minimal, Complete and Verifiable Example” will make it much easier for maintainers to help you: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports
# Your code here
import pandas as pd
import numpy as np
import xarray as xr
time = pd.date_range('2010-01-01','2011-12-31',freq='M')
lat = np.linspace(-5.175003, -4.7250023, 10)
lon = np.linspace(33.524994, 33.97499, 10)
precip = np.random.normal(0, 1, size=(len(time), len(lat), len(lon)))
ds = xr.Dataset(
    {'precip': (['time', 'lat', 'lon'], precip)},
    coords={
        'lon': lon,
        'lat': lat,
        'time': time,
    }
)
variable = 'precip'
# calculate a cumsum over some window size
rolling_window = 3
ds_window = (
    ds.rolling(time=rolling_window, center=True)
    .sum()
    .dropna(dim='time', how='all')
)
# construct a cumulative frequency distribution ranking the precip values
# per month
rank_norm_list = []
for mth in range(1, 13):
    ds_mth = (
        ds_window
        .where(ds_window['time.month'] == mth)
        .dropna(dim='time', how='all')
    )
    rank_norm_mth = (
        (ds_mth.rank(dim='time') - 1) / (ds_mth.time.size - 1.0) * 100.0
    )
    rank_norm_mth = rank_norm_mth.rename({variable: 'rank_norm'})
    rank_norm_list.append(rank_norm_mth)
rank_norm = xr.merge(rank_norm_list).sortby('time')
# assign bins to variable xarray
bins = [20., 40., 60., 80., np.inf]
decile_index_gpby = rank_norm.groupby_bins('rank_norm', bins=bins)
out = decile_index_gpby.assign() # assign_coords()
Problem description
[this should explain why the current behavior is a problem and why the expected output is a better solution.]
I want to calculate the Decile Index; see the notebook ex1-Calculate Decile Index (DI) with Python.ipynb. The pandas implementation is simple enough, but I need help with applying the bin labels to a new variable / coordinate.
Expected Output
<xarray.Dataset>
Dimensions:   (lat: 10, lon: 10, time: 24)
Coordinates:
  * time      (time) datetime64[ns] 2010-01-31 2010-02-28 ... 2011-12-31
  * lat       (lat) float32 -5.175003 -5.125 -5.075001 ... -4.7750015 -4.7250023
  * lon       (lon) float32 33.524994 33.574997 33.625 ... 33.925003 33.97499
Data variables:
    precip    (time, lat, lon) float32 4.6461554 4.790813 ... 7.3063064 7.535994
    rank_bin  (lat, lon, time) int64 1 3 3 0 1 4 2 3 0 1 ... 0 4 0 1 3 1 2 2 3 1
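One possible way to get a per-cell `rank_bin` variable like the one in the expected output is to skip the groupby object and bin the values directly. The following is only a sketch, not the approach suggested by the maintainers: it assumes `np.digitize` applied through `xr.apply_ufunc` is acceptable, it uses a small random stand-in for `rank_norm`, and the interior edges `[20, 40, 60, 80]` are an assumption chosen so that the result takes the values 0 to 4.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the `rank_norm` values from the code sample:
# percentile ranks in [0, 100] on a small (time, lat, lon) grid.
rng = np.random.RandomState(0)
rank_norm = xr.DataArray(
    rng.uniform(0, 100, size=(24, 3, 3)),
    dims=("time", "lat", "lon"),
    name="rank_norm",
)

# Interior bin edges (an assumption): np.digitize returns 0 for values
# below 20, 1 for [20, 40), 2 for [40, 60), 3 for [60, 80), 4 for >= 80.
edges = np.array([20.0, 40.0, 60.0, 80.0])

# apply_ufunc keeps the dimensions and coordinates of the input.
rank_bin = xr.apply_ufunc(np.digitize, rank_norm, kwargs={"bins": edges})
rank_bin = rank_bin.rename("rank_bin")

# Combine the ranks and their bin indices in one Dataset.
ds_out = xr.merge([rank_norm, rank_bin])
```

Missing values are not handled in this sketch; if `rank_norm` contains NaNs they would need to be masked separately, for example with `rank_bin.where(rank_norm.notnull())`.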
Output of xr.show_versions()
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: 1.0.17
cfgrib: 0.9.7
iris: None
bottleneck: 1.2.1
dask: 1.2.2
distributed: 1.28.1
matplotlib: 3.1.0
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 41.0.1
pip: 19.1
conda: None
pytest: 4.5.0
IPython: 7.1.1
sphinx: 2.0.1
Issue Analytics
- State:
- Created 4 years ago
- Comments:8 (3 by maintainers)
Top GitHub Comments
If you just want different coordinates for the result of `groupby_bins`, you can pass the `labels` keyword. See example here: http://xarray.pydata.org/en/stable/groupby.html#binning
Perfect, thank you!
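The `labels` suggestion can be illustrated with a small sketch. The data here are hypothetical (six made-up ranks attached as a non-index coordinate to a toy `precip` array), and the bin edges `0` to `100` are an assumption; only `groupby_bins` and its `bins` and `labels` parameters come from the discussion above.

```python
import numpy as np
import xarray as xr

# Hypothetical toy data: six precipitation values, each with a
# percentile rank stored as a non-index coordinate.
da = xr.DataArray(
    np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]),
    dims="sample",
    coords={"rank_norm": ("sample", [5.0, 25.0, 45.0, 65.0, 85.0, 95.0])},
    name="precip",
)

# Five bins need six edges (assumed here); the integer labels replace
# the default interval labels on the resulting `rank_norm_bins` coordinate.
bins = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
labels = [0, 1, 2, 3, 4]

grouped = da.groupby_bins("rank_norm", bins=bins, labels=labels)

# Reducing the groups gives one value per labelled bin.
counts = grouped.count()
```

Note that `labels` renames the groups on the reduced result; it does not by itself attach a bin index to every original element.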