ReferenceEnsemble suggestions
I'm unfamiliar with the difference between an uninitialized run and an initialized run, but I'm wondering if we can generalize this to just two categories so it's more applicable to a wider audience, and also test multiple models against multiple references all at once (say, NAM, GFS, HRRR vs. OISST and NCEP Reanalysis):
- Model / Test Runs
- Base / Verification / Observations / Reanalysis / Control Runs
So from:
```
Initialized Ensemble:
    SST      (initialization, time, member) float64 -0.01018 0.01447 ... 0.102
    SSS      (initialization, time, member) float64 0.0154 0.01641 ... -0.01901
FOSI:
    SST      (initialization) float64 -0.05523 -0.0491 0.1105 ... 0.03564 0.1673
    SSS      (initialization) float64 0.01213 0.005007 ... -0.008966 -0.02685
ERSST:
    SST      (initialization) float64 -0.06196 -0.02328 ... 0.07206 0.1659
Uninitialized:
    SST      (initialization) float64 ...
```
To:
```
-- Model Runs --
**NAM Ensemble**
    SST      (initialization, time, member) float64 -0.01018 0.01447 ... 0.102
    SSS      (initialization, time, member) float64 0.0154 0.01641 ... -0.01901
**NAM Deterministic**
    SST      (initialization, time) float64 -0.01018 0.01447 ... 0.102
    SSS      (initialization, time) float64 0.0154 0.01641 ... -0.01901
**GFS**
    SST      (initialization) float64 -0.05523 -0.0491 0.1105 ... 0.03564 0.1673
    SSS      (initialization) float64 0.01213 0.005007 ... -0.008966 -0.02685
**HRRR**
    SST      (initialization) float64 -0.06196 -0.02328 ... 0.07206 0.1659
-- References --
**NCEP Reanalysis**
    SST      (initialization) float64 ...
**OISST**
    SST      (initialization) float64 ...
```
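To make the proposal a bit more concrete, here is a rough sketch in plain xarray of what scoring many model runs against many references could look like. Everything here is hypothetical for illustration (the `models`/`references` dicts, the `SST` variable, correlation as the metric); none of it is existing climpred API.

```python
# Hypothetical sketch of "many models vs. many references" in plain xarray.
# `models` and `references` are assumed to be dicts of xr.Dataset objects that
# share a "time" dimension and an "SST" variable.
import xarray as xr


def pairwise_skill(models, references, var="SST"):
    """Score every model against every reference over their common times."""
    skill = {}
    for model_name, model_ds in models.items():
        for ref_name, ref_ds in references.items():
            # keep only the time steps both datasets share before scoring
            m, r = xr.align(model_ds[var], ref_ds[var], join="inner")
            skill[(model_name, ref_name)] = xr.corr(m, r, dim="time")
    return skill


# e.g. pairwise_skill({"NAM": nam, "GFS": gfs, "HRRR": hrrr},
#                     {"OISST": oisst, "NCEP Reanalysis": ncep})
```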
I understand that I'm just butting in and this might not be in the scope of this project (after all, the repo is named "clim"pred and all the models I've listed are weather models), but I think the evaluation metrics are immediately applicable to both cases.
After looking again at the notebook, I'm also confused about the difference between a "Reference Ensemble" and a "Perfect Model Ensemble", since it looks like only the naming differs (Reference vs. Control), although I must admit I just skimmed the notebooks. And I suppose there's an additional bootstrap method.
Oh I now see:
How to compare predictability skill score:
As no observational data interferes with the random climate evolution of the model, we cannot use an observation-based reference. Therefore we can compare the members with each other (m2m), each member against the ensemble mean (m2e), each member against the control (m2c), or the ensemble mean against the control (e2c).
When to use:
- you don't have a sufficiently long observational record to use as a reference
- you want to avoid biases between the model climatology and the reanalysis climatology
- you want to avoid overly sensitive reactions of biogeochemical cycles to disruptive changes in ocean physics due to assimilation
- you want to delve into process understanding of a model's predictability without outside artefacts
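(Aside: a minimal sketch of how those m2m/m2e/m2c/e2c options might map onto a verification call. The class and method names here, `PerfectModelEnsemble`, `add_control`, and `verify`, are taken from later climpred releases, so treat this as an assumption rather than the API being discussed in this issue.)

```python
# Sketch only: names follow later climpred releases and may not match the
# version discussed in this issue.
import climpred

pm = climpred.PerfectModelEnsemble(init_ds)  # init_ds: dims (init, lead, member)
pm = pm.add_control(control_ds)              # control_ds: dims (time,)

# comparison can be "m2m", "m2e", "m2c", or "e2c", as described in the quote above
skill = pm.verify(metric="pearson_r", comparison="m2e", dim=["init", "member"])
```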
Maybe make ReferenceEnsemble == PredictionEnsemble, which allows multiple models and multiple references, but limit PerfectModelEnsemble to one reference?
Top GitHub Comments
Closing this. Thanks for the useful discussion @ahuang11. We're going to work on implementing monthly/seasonal compatibility for the v1 release. Look out for PRs, etc., and let us know your thoughts.
@ahuang11, this might be the difference between climate simulations/predictions and NWP. An 'uninitialized' run is something like the IPCC/CMIP simulations. These models are usually initialized in 1850 and then run freely forward to capture the general evolution of the climate system under some prescribed forcing. There are also uninitialized ensembles (e.g., the CESM Large Ensemble), where the initial conditions are slightly perturbed and then all members are integrated forward under common forcing. This gives a confident assessment of the forced response of the climate system (the ensemble mean), as well as of the influence of internal climate variability (the spread amongst members).
This differs from an 'initialized' run, which applies classic NWP principles to an Earth System Model. As in, the ESM is re-initialized frequently to make literal forecasts, rather than climate projections.
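(The forced-response vs. internal-variability decomposition described above is just an ensemble mean and spread along the member dimension; a quick plain-xarray illustration with a toy, made-up `large_ensemble` dataset:)

```python
import numpy as np
import xarray as xr

# Toy stand-in for a large ensemble: 40 members x 100 time steps of an "SST"-like series.
rng = np.random.default_rng(0)
large_ensemble = xr.Dataset(
    {"SST": (("member", "time"), rng.normal(size=(40, 100)).cumsum(axis=1) * 0.01)}
)

forced_response = large_ensemble["SST"].mean("member")  # forced signal: ensemble mean
internal_spread = large_ensemble["SST"].std("member")   # internal variability: member spread
deviations = large_ensemble["SST"] - forced_response    # each member's internal wiggle
```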
It looks like you found the notes on this. These are two different ways of generating a climate prediction ensemble. The `ReferenceEnsemble` is like an NWP experiment: it is initialized from observations or pseudo-observations (like a reanalysis/reconstruction). A perfect-model experiment is initialized from a control run, so you can really only compare back to that control run. There are a lot of different ways to compare the prediction ensemble back to the control run, and it relies heavily on bootstrapping methods. It's helpful for now to separate these two, since the analyses run on them are fairly different.

I appreciate all the feedback and ideas! For now, I think we are just focusing on explicit support for decadal climate prediction ensembles. The next step would be to support subseasonal-to-seasonal predictions. This won't require a huge overhaul, but some minor edits to allow skill computation for things other than annual resolution.
I think getting the finer-than-annual resolution code working will then implicitly allow weather models to work as well. I agree that all the eval metrics can be applied to them, especially if they are pushed through the `ReferenceEnsemble` framework.
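For what it's worth, pushing a weather model through the `ReferenceEnsemble` framework would presumably look something like the sketch below. The names used (`HindcastEnsemble`, which I believe is what `ReferenceEnsemble` was later renamed to, plus `add_observations` and `verify`) come from later climpred releases, so this is an assumption rather than the API discussed here.

```python
# Sketch only: NWP-style verification against observations, using names from
# later climpred releases; arguments may differ from the version in this issue.
import climpred

hindcast = climpred.HindcastEnsemble(init_ds)  # init_ds: dims (init, lead, member)
hindcast = hindcast.add_observations(obs_ds)   # obs_ds: dims (time,)

skill = hindcast.verify(
    metric="rmse",
    comparison="e2o",         # ensemble mean vs. observations
    dim="init",
    alignment="same_verifs",  # how initializations are matched to verification dates
)
```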