
Usage Suggestions


Just some thoughts as I was trying it out:

  1. When calling cp.ReferenceEnsemble(ds), I get
ValueError: Your decadal prediction object must contain the
            dimensions `lead` and `init` at the minimum.

Perhaps cp.ReferenceEnsemble(ds) could accept lead and init keywords that let the user specify those dimensions without first calling ds = ds.rename({'target': 'lead', 'initial_time': 'init'}). Something like cp.ReferenceEnsemble(ds, lead='target', init='initial_time').

  2. When I try re.add_reference(ref_ds, 'ncep') where re = cp.ReferenceEnsemble(ds), I get
ValueError: Dimensions must match initialized
            prediction ensemble dimensions (excluding `lead` and `member`.)

Maybe if the dimension names are the same but the grids differ, you could try regridding the reference automatically (for example with xESMF, https://xesmf.readthedocs.io/en/latest/, or even just xarray’s built-in interp method) and raise a warning?

  3. Also, I think it would be nice to print out which dimensions do not match in the error raised by the set(ref.dims) == set(init_dims) check. It took me a long time to realize that, within that function, the dimension “init” is renamed to “time”. Because the input ds had an init dimension, I had also renamed the time dimension of my ref_ds to init.

  4. Next, I’m using a dask array; probably related more to xskillscore, but this is the error I encounter.

ValueError: dimension 'time' on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, rechunk into a single dask array chunk along this dimension, i.e., ``.rechunk({'time': -1})``, but beware that this may significantly increase memory usage.

  5. Now that I’m done with all the preprocessing, I’m not sure why I get all NaNs for the first lead. [image]
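
On the suggestion to print which dimensions do not match: a minimal sketch of what such a message could look like, using only Python sets. The helper name describe_dim_mismatch is hypothetical and is not part of climpred; the example values mirror the case described above, where the reference was renamed to init while the check expected time.

```python
def describe_dim_mismatch(ref_dims, init_dims):
    """Build a readable message listing which dimensions differ.

    Hypothetical helper for illustration only; not a climpred function.
    Returns None when both sets of dimension names are identical.
    """
    ref_dims, init_dims = set(ref_dims), set(init_dims)
    missing = sorted(init_dims - ref_dims)  # expected, but absent from the reference
    extra = sorted(ref_dims - init_dims)    # present only in the reference
    if not missing and not extra:
        return None
    return (
        "Reference dimensions do not match the initialized ensemble. "
        f"Missing from reference: {missing}; unexpected in reference: {extra}."
    )


# The reference carries `init`, while the check expects `time`.
message = describe_dim_mismatch(
    ref_dims=("init", "lat", "lon"),
    init_dims=("time", "lat", "lon"),
)
print(message)
```

Raising a ValueError with this text would point directly at the init/time rename instead of leaving the user to compare the two dimension sets by hand.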

Anyways, some final thoughts.

  • It’d be nice to be able to compare multiple models against multiple references. My suggestion is to use the first model as the base and require all the other models and reference datasets to adhere to the base’s dimensions.

  • It’d also be nice to add support for target time (datetime) instead of integer leads, or leads as Timedelta objects.

  • Before v1 public/pip release, I think docs are really important!
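
To illustrate the point about target times: a small sketch, assuming monthly leads, of how an integer lead and an initialization date map to a target (valid) datetime with pandas. The helper name target_times and its unit parameter are hypothetical, not climpred API.

```python
import pandas as pd


def target_times(init, leads, unit="months"):
    """Return the target (valid) time for each integer lead.

    Hypothetical helper for illustration. `unit` is assumed to be a
    keyword accepted by pd.DateOffset, such as "months" or "days".
    """
    init = pd.Timestamp(init)
    return [init + pd.DateOffset(**{unit: int(lead)}) for lead in leads]


# Leads 1 to 3 for a forecast initialized on 2019-01-01.
targets = target_times("2019-01-01", [1, 2, 3])
print(targets)
```

Calendar-aware offsets are used here because a monthly lead has no fixed length; for leads of fixed duration (days or hours), pd.Timedelta would be the simpler choice.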

And with that, I’m happy to help with this; just let me know what you would like help on.

Here’s the code that I used.

import os

import xarray as xr
import pandas as pd
import dask.bag as db
import climpred as cp

FCS_MODS = ['CFSv2']# , 'CMC1', 'CMC2', 'GFDL', 'GFDL_FLOR',
            # 'NASA_GEOS5v2', 'NCAR_CCSM4']
DT_RANGE = pd.date_range('2019-01-08', '2019-04-08', freq='1M')
BASE_URL = 'https://ftp.cpc.ncep.noaa.gov/NMME/realtime_anom/'

urls = [
    BASE_URL + f'{mod}/{dt:%Y%m}0800/{mod}.tmp2m.{dt:%Y%m}.anom.nc'
    for mod in FCS_MODS for dt in DT_RANGE
]

# uncomment to download data
# db.from_sequence(urls, npartitions=4).map(lambda url: os.system(f'wget -nc {url}')).compute()
# !wget -nc ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis.derived/surface_gauss/air.2m.mon.mean.nc

ds = xr.concat((xr.open_mfdataset(
    f'{mod}.*.nc', concat_dim='initial_time', decode_cf=False
).assign(**{'model': mod}) for mod in FCS_MODS), 'model')
ds['initial_time'] = pd.date_range('2019-01', '2019-03', freq='1MS')
ds['target'] = pd.date_range('2019-01', periods=len(ds['target']), freq='1MS')
ds['lead'] = ('target', range(len(ds['target'])))
ds['fcst'] += 273.15

ref_ds = xr.open_dataset('air.2m.mon.mean.nc')

both_time = sorted(list(set(ds['initial_time'].values) & set(ref_ds['time'].values)))
ds = ds.sel(target=both_time).sortby('lat')
ref_ds = ref_ds.sel(time=both_time).sortby('lat')
ref_ds = ref_ds.interp(lat=ds['lat'], lon=ds['lon'])
ref_ds = ref_ds.rename({'air': 'fcst'})

ds = ds.rename({'initial_time': 'init', 'ensmem': 'member'}
               ).swap_dims({'target': 'lead'}).isel(model=0).load()

re = cp.ReferenceEnsemble(ds)

re.add_reference(ref_ds, 'ncep')

# import hvplot.xarray
# re.compute_metric('ncep', metric='rmse').hvplot('lon', 'lat')

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 10 (2 by maintainers)

Top GitHub Comments

1 reaction
bradyrx commented, May 7, 2019

It’s not that there are NaNs in the grid itself (rmse or something like that just puts a NaN wherever there’s a NaN in either of the things being compared). It’s that there are fully blank slices after post-processing.

ds.fcst.plot(col='init', row='lead')

[Screenshot: ds.fcst plotted with init as columns and lead as rows]

Decomposing compute_reference to a few simple commands:

from climpred.comparisons import get_comparison_function
from climpred.metrics import get_metric_function
from climpred.prediction import _shift

comparison = get_comparison_function('e2r')
forecast, reference = comparison(ds, ref_ds)
metric = get_metric_function('rmse')
plag = []
i = 0
a, b = _shift(forecast.isel(lead=i).rename({'init': 'time'}), reference, i, dim='time')
# initialized ensemble at lead-1 (zero-index)
a.fcst.plot(col='time')
# reference shifted to lead-1
b.fcst.plot(col='time')

[Screenshots: a.fcst (initialized ensemble) and b.fcst (shifted reference), each plotted with time as columns]

So computing an RMSE on the above is comparing some NaN slices to data slices and breaks the RMSE. re.compute_metric(metric='rmse') plots fine for .isel(lead=1) and .isel(lead=2).

Thoughts:

  1. This is another area where we need clear error reporting. If a compute function detects NaN slices, it should throw a warning.
  2. The way the first figure looks, it seems planned. So this might be a miscommunication of lead vs. init vs. time between @aaronspring and me.
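
A minimal sketch of the kind of NaN-slice warning described in the first thought, written against a plain NumPy array rather than climpred’s internals. The function name warn_on_all_nan_slices and the choice of the first axis as the checked dimension are assumptions made for illustration.

```python
import warnings

import numpy as np


def warn_on_all_nan_slices(values, axis_name="time"):
    """Warn if any slice along the first axis contains only NaNs.

    Hypothetical helper for illustration; returns the positions of the
    all-NaN slices so the caller can drop or inspect them.
    """
    values = np.asarray(values, dtype=float)
    # Collapse every axis except the first, then test each slice as a row.
    flat = values.reshape(values.shape[0], -1)
    empty = np.flatnonzero(np.isnan(flat).all(axis=1)).tolist()
    if empty:
        warnings.warn(
            f"All-NaN slices along '{axis_name}' at positions {empty}; "
            "metrics computed over them will be NaN."
        )
    return empty


# Three time steps on a 2x2 grid, with the first step fully blank.
forecast = np.ones((3, 2, 2))
forecast[0] = np.nan
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    empty_slices = warn_on_all_nan_slices(forecast)
print(empty_slices, [str(w.message) for w in caught])
```

A check like this, run on both the forecast and the shifted reference before the metric is applied, would flag the blank slices that cause an all-NaN first lead.
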
0 reactions
ahuang11 commented, Jun 12, 2019