seasonal forecast valid-time inconsistency

Hi,

We have been looking at the xarray objects and netCDF files produced by cfgrib for the seasonal forecast data. The valid_time recovered using cfgrib is not exactly consistent with the time stamp recovered by the ecCodes grib_to_netcdf method, and users may be misinterpreting the output they receive.

Using cfgrib, the value of valid_time is the time at the end of the time aggregation window. I think cfgrib must calculate this as the sum of time and step. Using ecCodes, the time output is at the start of the aggregation window, which for step index n would be the sum of time and step(n-1).
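The two conventions can be sketched with plain datetime arithmetic. This is only an illustration: the step values (744 and 1416 hours, i.e. the ends of January and February 2017) are assumed for the example rather than read from a GRIB file, and the helper name window_bounds is hypothetical.

```python
from datetime import datetime, timedelta


def window_bounds(time, steps_hours):
    """Return (start, end) pairs for consecutive aggregation windows.

    Assumes each window starts where the previous one ended, and that the
    first window starts at the forecast reference time (step 0).
    """
    bounds = []
    previous = 0
    for step in steps_hours:
        start = time + timedelta(hours=previous)  # time + step(n-1)
        end = time + timedelta(hours=step)        # time + step(n)
        bounds.append((start, end))
        previous = step
    return bounds


# Assumed values: reference time 2017-01-01, windows ending after 744 h and 1416 h.
reference = datetime(2017, 1, 1)
for start, end in window_bounds(reference, [744, 1416]):
    print("start of window:", start.date(), "| end of window:", end.date())
```

Under these assumptions the end-of-window values correspond to what the issue describes for cfgrib's valid_time, and the start-of-window values to the time reported by grib_to_netcdf.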

I am bringing this to your attention so that you can decide the best way to deal with it, but it would be very useful for us (CDS-toolbox) to have access to the start of the time aggregation window in the xarray that is recovered. Otherwise we will have to build some sort of workaround in the toolbox. One option could be to output the time bounds of the aggregation period, with an additional dimension of length 2.

Additionally, I think the metadata should probably state that valid_time represents the time at the end of the aggregation window, as this may not be what users expect.

Please let me know if you need any more information about what I am trying to explain. As an example, use the GRIB file retrieved with the API request below. The time stamps differ when it is converted using the grib_to_netcdf and cfgrib approaches.

import cdsapi

c = cdsapi.Client()

c.retrieve(
    'seasonal-monthly-single-levels',
    {
        'originating_centre':'ecmwf',
        'system':'5',
        'variable':'2m_dewpoint_temperature',
        'product_type':'monthly_mean',
        'year':'2017',
        'month':[
            '01','02'
        ],
        'leadtime_month':[
            '1','2'
        ],
        'format':'grib'
    },
    'download.grib')

Issue Analytics

Created 4 years ago
Comments: 14

Top GitHub Comments

1 reaction

edupenabad commented, Nov 17, 2019

@alexamici I can only sympathise with your feeling of confusion. It really highlights how important it is to get this done properly, so that even highly skilled users who are more than decently familiar with the files do not interpret them wrongly.

I will put what I said in my previous comment a bit more bluntly. It is not that I have doubts: I am convinced that issue #38 was solved by the feature added in 0.9.7.3, while issue #97 remains unsolved.

I’ll try to elaborate on the problem with a couple of additional examples that I hope will show its implications more concisely.

If we explore some of the keywords related to time coordinates in a SEAS5 monthly means file:

$ grib_ls -p date,time,validityDate,validityTime,forecastMonth,verifyingMonth seas5_monthlymeans.grib 
seas5_monthlymeans.grib
date            time            validityDate    validityTime    forecastMonth   verifyingMonth  
20191101        0               20191201        0               1               201911         
20191101        0               20200101        0               2               201912         
20191101        0               20200201        0               3               202001         
20191101        0               20200301        0               4               202002         
20191101        0               20200401        0               5               202003         
20191101        0               20200501        0               6               202004         
6 of 6 messages in seas5_monthlymeans.grib

6 of 6 total messages in 1 files

Note the discrepancy between verifyingMonth (the keyword in local definitions table 16 I mentioned in my first comment) and validityDate/validityTime (the keywords cfgrib uses to populate the valid_time coordinate, which are computed keywords rather than keywords physically stored in the GRIB header). As a more-than-educated guess, I would say the latter are computed as date/time + step.

If we use grib_to_netcdf to convert that same file to netcdf:

$ grib_to_netcdf -o seas5_monthlymeans.nc seas5_monthlymeans.grib 
grib_to_netcdf: Version 2.14.1
grib_to_netcdf: Processing input file 'seas5_monthlymeans.grib'.
grib_to_netcdf: Found 6 GRIB fields in 1 file.
grib_to_netcdf: Ignoring key(s): method, type, stream, refdate, hdate
grib_to_netcdf: Creating netCDF file 'seas5_monthlymeans.nc'
grib_to_netcdf: NetCDF library version: 4.6.3 of May  8 2019 00:09:03 $
grib_to_netcdf: Creating large (64 bit) file format.
grib_to_netcdf: Defining variable 't2m'.
grib_to_netcdf: Done.

$ python -c "import netCDF4 as nc; ds=nc.Dataset('seas5_monthlymeans.nc','r'); aa=nc.num2date(ds.variables['time'][:],ds.variables['time'].units); print (aa)"
[datetime.datetime(2019, 11, 1, 0, 0) datetime.datetime(2019, 12, 1, 0, 0)
 datetime.datetime(2020, 1, 1, 0, 0) datetime.datetime(2020, 2, 1, 0, 0)
 datetime.datetime(2020, 3, 1, 0, 0) datetime.datetime(2020, 4, 1, 0, 0)]

As you can see, grib_to_netcdf understands that the GRIB file has been written using local definitions (specifically table 16), and therefore populates the time coordinate with the right values: Nov/2019 is the sensible value (label) for forecastMonth=1 in a monthly means file with nominal start date 20191101, not Dec/2019 as might be inferred from the computed validityDate/validityTime values. These last sentences are not a guess, an interpretation or an expression of personal preference; they explain how seasonal forecast monthly means GRIB files have been consistently encoded at ECMWF for a long time.
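The relationship between the columns of the grib_ls listing can be reproduced with a few lines of calendar arithmetic. This is a minimal sketch, not how ecCodes or cfgrib compute these keys: the function names are hypothetical, and it assumes the nominal start date falls on the first day of a month and that forecastMonth=1 labels the month of the start date.

```python
from datetime import date


def verifying_month(start, forecast_month):
    """First day of the calendar month labelled by forecastMonth.

    Assumes forecastMonth=1 refers to the month of the nominal start date.
    """
    months = start.year * 12 + (start.month - 1) + (forecast_month - 1)
    return date(months // 12, months % 12 + 1, 1)


def end_of_window(start, forecast_month):
    """First day of the following month, i.e. the end of the monthly window."""
    return verifying_month(start, forecast_month + 1)


start = date(2019, 11, 1)  # nominal start date from the listing
for fm in range(1, 7):
    print(
        fm,
        verifying_month(start, fm).strftime("%Y%m"),     # compare with verifyingMonth
        end_of_window(start, fm).strftime("%Y%m%d"),     # compare with validityDate
    )
```

For forecastMonth=1 the first function gives 201911 and the second 20191201, the same one-month offset seen between verifyingMonth and validityDate in the listing.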

Finally, regarding your thoughts on which point within an interval to use as its label: as you know, CF handles the encoding of an aggregation across a coordinate through ‘Cell Boundaries’, which define the interval of aggregation unequivocally. Given that complete definition of the interval, the choice of a particular value within the interval as its label is, in my opinion, more a question of use cases than of personal preference, and we can easily find reasonable use cases for the beginning, the middle or the end of the interval.
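To make the point about labels concrete, here is a small sketch in which the interval bounds are the primary information and the label is derived from them. In CF the coordinate variable refers to such a bounds array through its bounds attribute; here the array is just a list of (start, end) pairs, and the helper name label is hypothetical.

```python
from datetime import datetime

# Bounds for the first two monthly means of the listing above, laid out
# like a CF bounds variable with a trailing dimension of length 2.
time_bnds = [
    (datetime(2019, 11, 1), datetime(2019, 12, 1)),
    (datetime(2019, 12, 1), datetime(2020, 1, 1)),
]


def label(bounds, where):
    """Pick a representative time for an interval: 'start', 'middle' or 'end'."""
    start, end = bounds
    if where == "start":
        return start
    if where == "end":
        return end
    if where == "middle":
        return start + (end - start) / 2
    raise ValueError(f"unknown label position: {where!r}")


for bnds in time_bnds:
    print([label(bnds, w).isoformat() for w in ("start", "middle", "end")])
```

Once the bounds are stored, any of the three labels can be recovered on demand, so the choice of label no longer loses information.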

I hope these points help to disentangle the confusion and to find the most suitable solution to this issue.

0 reactions

Tinkaa commented, Dec 7, 2021

Hi! I was close to trusting the valid_time parameter to actually represent the forecasted month until I double-checked and found this issue. It is now working like a charm using verifying_time and forecastMonth, so thanks for that! I was wondering whether the what and why of this solution are documented somewhere? Then I can avoid the mistake in the future and share it with others. Thanks!
