Usage Suggestions
Just some thoughts as I was trying it out:
- When calling `cp.ReferenceEnsemble(ds)`, I get ``ValueError: Your decadal prediction object must contain the dimensions `lead` and `init` at the minimum.`` Perhaps `cp.ReferenceEnsemble(ds)` could accept `lead` and `init` keywords that let the user specify those dimensions without needing to do `ds = ds.rename({'target': 'lead', 'initial_time': 'init'})`, something like `cp.ReferenceEnsemble(ds, lead='target', init='initial_time')` (see the first sketch after this list).
- When I try `re.add_reference(ref_ds, 'ncep')`, where `re = cp.ReferenceEnsemble(ds)`, I get ``ValueError: Dimensions must match initialized prediction ensemble dimensions (excluding `lead` and `member`.)`` Maybe if the dimension names are the same, you could regrid the reference automatically and raise a warning, either with xESMF (https://xesmf.readthedocs.io/en/latest/) or just xarray's built-in `interp` method (a sketch follows this list).
- Also, I think it would be nice to print out which dimensions are not matching in the error when you do the `set(ref.dims) == set(init_dims)` check, because it took me a long time to realize that within that function the dimension `init` gets renamed to `time`, even though the input `ds` initially had an `init` dim, so I had also renamed my `ref_ds` dim to `init` (see the error-message sketch after this list).
- Next, I'm using a dask array; this is probably related more to xskillscore, but this is the error I encounter (the rechunk workaround is sketched after this list):

  ValueError: dimension 'time' on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, rechunk into a single dask array chunk along this dimension, i.e., ``.rechunk({'time': -1})``, but beware that this may significantly increase memory usage.
- Now that I’m done with all the preprocessing, I’m not sure why I get all NaNs for the first lead.
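To make a few of the suggestions above concrete, here is a minimal sketch of the `lead`/`init` keyword idea as a user-side wrapper (the keyword handling is hypothetical, not existing climpred API):

```python
import climpred as cp

def reference_ensemble(ds, lead='lead', init='init'):
    # hypothetical convenience: rename the user's dimension names to the
    # `lead`/`init` names that ReferenceEnsemble currently requires
    rename = {name: target for name, target in ((lead, 'lead'), (init, 'init'))
              if name != target}
    return cp.ReferenceEnsemble(ds.rename(rename))

# re = reference_ensemble(ds, lead='target', init='initial_time')
```

For the mismatched-reference case, a sketch of a more informative dimension check plus an automatic regrid fallback (the helper names and the use of xarray's `interp` instead of xESMF are my assumptions, not climpred internals):

```python
import warnings

def check_dims(ref, init_dims):
    # hypothetical replacement for the bare set-equality check:
    # report exactly which dimension names differ
    missing = set(init_dims) - set(ref.dims)
    extra = set(ref.dims) - set(init_dims)
    if missing or extra:
        raise ValueError(
            'Reference dims must match initialized ensemble dims '
            f'(excluding `lead` and `member`): missing {sorted(missing)}, '
            f'unexpected {sorted(extra)}.'
        )

def add_reference_regridded(re, ref_ds, name, grid, spatial_dims=('lat', 'lon')):
    # hypothetical helper: interpolate the reference onto the ensemble grid
    # with xarray's built-in interp and warn that regridding happened
    mismatched = [d for d in spatial_dims
                  if d in ref_ds.dims and not ref_ds[d].equals(grid[d])]
    if mismatched:
        warnings.warn(f'Regridding reference along {mismatched} to match the ensemble grid.')
        ref_ds = ref_ds.interp({d: grid[d] for d in mismatched})
    re.add_reference(ref_ds, name)

# add_reference_regridded(re, ref_ds, 'ncep', grid=ds)
```

And the dask/xskillscore error already spells out its own workaround; applied to the reference used in the code below, it would be:

```python
# collapse the core dimension into a single dask chunk (may increase memory use)
ref_ds = ref_ds.chunk({'time': -1})
```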
Anyways, some final thoughts.
- It'd be nice to be able to compare multiple models against multiple references. My suggestion is to use the first model as the base, and require all the other model/reference datasets to adhere to the base's dimensions.
- It'd also be nice to add support for target times (datetimes) instead of integer leads, or leads as Timedelta objects (a conversion sketch follows this list).
- Before a v1 public/pip release, I think docs are really important!
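As an interim take on the datetime-lead idea above, a sketch of deriving integer monthly leads from datetime targets (the `target` name and monthly spacing come from the code below; the conversion itself is my assumption, not climpred functionality):

```python
import pandas as pd

# hypothetical conversion: integer leads in months, measured from the first target
target = pd.DatetimeIndex(ds['target'].values)
lead_months = (target.year - target[0].year) * 12 + (target.month - target[0].month)
ds = ds.assign_coords(lead=('target', lead_months))
```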
And with that, I’m happy to help with this; just let me know what you would like help on.
Here’s the code that I used.
```python
import os
import xarray as xr
import pandas as pd
import dask.bag as db
import climpred as cp

FCS_MODS = ['CFSv2']  # , 'CMC1', 'CMC2', 'GFDL', 'GFDL_FLOR',
                      # 'NASA_GEOS5v2', 'NCAR_CCSM4'
DT_RANGE = pd.date_range('2019-01-08', '2019-04-08', freq='1M')
BASE_URL = 'https://ftp.cpc.ncep.noaa.gov/NMME/realtime_anom/'
urls = [
    BASE_URL + f'{mod}/{dt:%Y%m}0800/{mod}.tmp2m.{dt:%Y%m}.anom.nc'
    for mod in FCS_MODS for dt in DT_RANGE
]
# uncomment to download data
# db.from_sequence(urls, npartitions=4).map(lambda url: os.system(f'wget -nc {url}')).compute()
# !wget -nc ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis.derived/surface_gauss/air.2m.mon.mean.nc

# build the forecast dataset: concatenate models, assign init/target/lead coordinates
ds = xr.concat((xr.open_mfdataset(
    f'{mod}.*.nc', concat_dim='initial_time', decode_cf=False
).assign(**{'model': mod}) for mod in FCS_MODS), 'model')
ds['initial_time'] = pd.date_range('2019-01', '2019-03', freq='1MS')
ds['target'] = pd.date_range('2019-01', periods=len(ds['target']), freq='1MS')
ds['lead'] = ('target', range(len(ds['target'])))
ds['fcst'] += 273.15  # shift from degC to Kelvin

# reference dataset: keep overlapping times and put the reanalysis on the model grid
ref_ds = xr.open_dataset('air.2m.mon.mean.nc')
both_time = sorted(list(set(ds['initial_time'].values) & set(ref_ds['time'].values)))
ds = ds.sel(target=both_time).sortby('lat')
ref_ds = ref_ds.sel(time=both_time).sortby('lat')
ref_ds = ref_ds.interp(lat=ds['lat'], lon=ds['lon'])
ref_ds = ref_ds.rename({'air': 'fcst'})

# rename to climpred's expected dimension names and build the ensemble object
ds = ds.rename({'initial_time': 'init', 'ensmem': 'member'}
               ).swap_dims({'target': 'lead'}).isel(model=0).load()
re = cp.ReferenceEnsemble(ds)
re.add_reference(ref_ds, 'ncep')

# import hvplot.xarray
# re.compute_metric('ncep', metric='rmse').hvplot('lon', 'lat')
```
Top GitHub Comments
It's not that there are NaNs in the grid itself (`rmse` or similar just puts a NaN wherever either field being compared has one); it's that there are fully blank slices after post-processing: `ds.fcst.plot(col='init', row='lead')`. Decomposing `compute_reference` into a few simple commands shows that computing an RMSE on the above compares some NaN slices to data slices and breaks the RMSE.
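A quick way to confirm which slices are fully blank (a diagnostic sketch of mine, not part of the original comment; the `fcst`/`lat`/`lon` names come from the code above):

```python
# count fully-NaN lat/lon fields per lead; nonzero counts explain the NaN skill
blank = ds['fcst'].isnull().all(dim=['lat', 'lon'])
print(blank.sum(dim=[d for d in blank.dims if d != 'lead']))
```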
`re.compute_metric(metric='rmse')` plots fine for `.isel(lead=1)` and `.isel(lead=2)`. Thoughts:
- `lead` vs. `init` vs. `time` from @aaronspring and me.

Okay, still getting:
1. https://github.com/bradyrx/climpred/issues/112
2. https://github.com/bradyrx/climpred/issues/183
Other than that, all good!