Define metadata.yaml schema
Here’s what I’m thinking for the metadata.yaml schema. We can set up CI to validate this schema (potentially with jsonschema?).
Then a README of summary statistics/csv files can be automatically generated (which will allow for easy querying such as this).
hashid: # required, hash id of the dataset
dataset: # required, dataset name
description: # required, dataset description
source: # required, link to the source from where dataset was retrieved
publication: # optional, study that generated the dataset
task: # required, classification or regression
columns:
  [column_name]: # can be 'target'
    type: # required, either continuous, nominal or ordinal
    description: # required, what the column measures/indicates, unit
    code: # optional, coding information, e.g., 'Control' = 0, 'Case' = 1
    transform: # optional, any transformation performed on the column, e.g., log scaled
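To make the CI idea concrete, here is a minimal sketch of the checks the schema above implies. It is not the project's validator: it uses hand-written, standard-library-only checks rather than jsonschema, and it assumes the YAML file has already been parsed into a plain dict (for example with a YAML loader). The function name `validate_metadata` and all example values are hypothetical.

```python
# Rules below are read off the comments in the proposed schema.
REQUIRED_TOP_LEVEL = ("hashid", "dataset", "description", "source", "task", "columns")
ALLOWED_TASKS = {"classification", "regression"}
REQUIRED_COLUMN_KEYS = ("type", "description")
ALLOWED_COLUMN_TYPES = {"continuous", "nominal", "ordinal"}


def validate_metadata(meta):
    """Return a list of problems found; an empty list means the metadata passed."""
    if not isinstance(meta, dict):
        return ["metadata must be a mapping"]

    errors = []

    # Required top-level fields must be present and non-empty.
    for key in REQUIRED_TOP_LEVEL:
        if meta.get(key) in (None, ""):
            errors.append(f"missing required field: {key}")

    # 'task' is restricted to the two values named in the schema.
    task = meta.get("task")
    if task not in (None, "") and task not in ALLOWED_TASKS:
        errors.append(f"task must be one of {sorted(ALLOWED_TASKS)}, got {task!r}")

    # 'columns' maps each column name (possibly 'target') to its attributes.
    columns = meta.get("columns")
    if columns is not None and not isinstance(columns, dict):
        errors.append("columns must be a mapping of column name to attributes")
        columns = {}

    for name, attrs in (columns or {}).items():
        if not isinstance(attrs, dict):
            errors.append(f"column {name!r}: attributes must be a mapping")
            continue
        for key in REQUIRED_COLUMN_KEYS:
            if attrs.get(key) in (None, ""):
                errors.append(f"column {name!r}: missing required field: {key}")
        col_type = attrs.get("type")
        if col_type not in (None, "") and col_type not in ALLOWED_COLUMN_TYPES:
            errors.append(
                f"column {name!r}: type must be one of "
                f"{sorted(ALLOWED_COLUMN_TYPES)}, got {col_type!r}"
            )

    return errors


# Hypothetical metadata, shaped like a parsed metadata.yaml file.
example = {
    "hashid": "abc123",
    "dataset": "red_wine_quality",
    "description": "Example description",
    "source": "https://example.org/dataset",
    "task": "regression",
    "columns": {
        "target": {"type": "ordinal", "description": "quality score"},
    },
}

problems = validate_metadata(example)
print(problems if problems else "metadata is valid")
```

A CI job could run a check like this over every metadata.yaml file and fail when the returned list is non-empty. The same rules could also be written as a JSON Schema document if jsonschema ends up being the chosen tool.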
Issue Analytics
- Created 3 years ago
- Comments: 10 (2 by maintainers)
Top GitHub Comments
@trang1618 maybe you can open new issues about renaming datasets and renaming feature names.
I added a first example metadata.yaml file, but there are two things I would like to change:
- _ instead of .
- red_wine_quality instead of red-wine-quality
If we were going to make this an example, I think we should make the changes on the dataset first before moving on.