Adding CDL Parser/`open_cdl`?
Is your feature request related to a problem?
No.
Describe the solution you’d like
It would be nice to load/generate xarray datasets from Common Data Language (CDL) descriptions. CDL is a DSL that defines a netCDF dataset, and it is quite nice for testing. We use it to build mock datasets for, e.g., integration testing of plotting routines, complex data analysis, etc. CDL provides a concise format for storing the schema of this data. This schema can be used for validation or generation (using the CLI `ncgen`).
CDL is basically the format produced by `xarray.Dataset.info`. It looks like this:
    netcdf example { // example of CDL notation
    dimensions:
        lon = 3 ;
        lat = 8 ;
    variables:
        float rh(lon, lat) ;
            rh:units = "percent" ;
            rh:long_name = "Relative humidity" ;
    // global attributes
        :title = "Simple example, lacks some conventions" ;
    data:
    /// optional ...ncgen will still build
        rh =
            2, 3, 5, 7, 11, 13, 17, 19,
            23, 29, 31, 37, 41, 43, 47, 53,
            59, 61, 67, 71, 73, 79, 83, 89 ;
    }
I wrote a small pure Python parser for CDL last night and it seems to work! There are similar projects on GitHub. Sadly, these projects seem to be abandoned, so it would be nice to attach this to an effort like xarray.
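To give a feel for how little code the header part of CDL needs, here is a minimal sketch of such a parser. It is not the parser mentioned above: the names `parse_cdl_header` and `CDL_TEXT` are made up for illustration, and the function only handles dimensions, variable declarations with parentheses, and string attributes. It skips the `data:` section and naively treats every `//` as a comment start, so it would break on quoted strings containing `//`.

```python
import re

# The example from this issue, used as test input (CDL_TEXT is a hypothetical name).
CDL_TEXT = """
netcdf example { // example of CDL notation
dimensions:
    lon = 3 ;
    lat = 8 ;
variables:
    float rh(lon, lat) ;
        rh:units = "percent" ;
        rh:long_name = "Relative humidity" ;
// global attributes
    :title = "Simple example, lacks some conventions" ;
data:
/// optional ...ncgen will still build
    rh =
        2, 3, 5, 7, 11, 13, 17, 19,
        23, 29, 31, 37, 41, 43, 47, 53,
        59, 61, 67, 71, 73, 79, 83, 89 ;
}
"""

DIM_RE = re.compile(r"^(\w+)\s*=\s*(\d+)$")               # lon = 3
VAR_RE = re.compile(r"^(\w+)\s+(\w+)\s*\(([^)]*)\)$")     # float rh(lon, lat)
ATTR_RE = re.compile(r'^(\w*):(\w+)\s*=\s*"(.*)"$')       # rh:units = "percent"


def parse_cdl_header(text):
    """Parse dims, variables and string attributes; everything else is ignored."""
    schema = {"dims": {}, "variables": {}, "attrs": {}}
    section = None
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        if line in ("dimensions:", "variables:", "data:"):
            section = line[:-1]
            continue
        if section is None or section == "data":
            continue  # skip the "netcdf name {" line and all data values
        stmt = line.rstrip(";").strip()
        if section == "dimensions":
            # CDL allows several comma-separated dimensions per statement.
            for part in stmt.split(","):
                m = DIM_RE.match(part.strip())
                if m:
                    schema["dims"][m.group(1)] = int(m.group(2))
        elif section == "variables":
            m = VAR_RE.match(stmt)
            if m:
                dims = tuple(d.strip() for d in m.group(3).split(",") if d.strip())
                schema["variables"][m.group(2)] = {
                    "dtype": m.group(1), "dims": dims, "attrs": {}}
                continue
            m = ATTR_RE.match(stmt)
            if m:
                name, key, value = m.groups()
                if name == "":
                    schema["attrs"][key] = value  # global attribute
                elif name in schema["variables"]:
                    schema["variables"][name]["attrs"][key] = value
    return schema


schema = parse_cdl_header(CDL_TEXT)
print(schema["dims"])
print(schema["variables"]["rh"])
```

A real `open_cdl` would also need UNLIMITED dimensions, numeric and typed attributes, scalar variables, groups, and the `data:` section, which is where most of the work is.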
Describe alternatives you’ve considered
Some kind of `schema` object that can be used to validate or generate an xarray Dataset, but does not contain any data.
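A rough sketch of what such a data-free schema object could look like, using only the standard library. `DatasetSchema`, `VariableSchema`, `expected_shape` and `validate` are hypothetical names, not an existing xarray API, and validation here only checks variable presence and shape.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class VariableSchema:
    """Describes one variable without holding any data (hypothetical)."""
    dtype: str
    dims: Tuple[str, ...]
    attrs: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class DatasetSchema:
    """Dimension sizes plus variable descriptions (hypothetical)."""
    dims: Dict[str, int]
    variables: Dict[str, VariableSchema]

    def expected_shape(self, name: str) -> Tuple[int, ...]:
        # Look up each of the variable's dimensions in the dimension table.
        return tuple(self.dims[d] for d in self.variables[name].dims)

    def validate(self, shapes: Dict[str, Tuple[int, ...]]) -> List[str]:
        """Return a list of problems; an empty list means the shapes conform."""
        problems = []
        for name in self.variables:
            if name not in shapes:
                problems.append(f"missing variable {name!r}")
            elif tuple(shapes[name]) != self.expected_shape(name):
                problems.append(
                    f"{name!r}: expected shape {self.expected_shape(name)}, "
                    f"got {tuple(shapes[name])}")
        return problems


# The rh example from the CDL above, expressed as a schema.
schema = DatasetSchema(
    dims={"lon": 3, "lat": 8},
    variables={"rh": VariableSchema("float", ("lon", "lat"), {"units": "percent"})},
)
print(schema.validate({"rh": (3, 8)}))
print(schema.validate({"rh": (8, 3)}))
```

For a real dataset `ds`, the `shapes` mapping could presumably be built as `{name: var.shape for name, var in ds.data_vars.items()}`; generating a Dataset from the schema would go the other way, allocating arrays of `expected_shape(name)`.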
Additional context
No response
To be fair, `ds.info` is not 100% CDL, but it’s darn close.

@jhamman We have a similar schema package: https://github.com/ai2cm/fv3net/tree/master/external/synth. It's cool to see you confronting the same challenges and advertising your solutions more broadly. One problem we had is that our schema objects ended up being quite verbose: https://github.com/ai2cm/fv3net/blob/master/external/loaders/tests/test__batch/one_step_zarr_schema.json. CDL is a lot more concise.