Include yaml examples in dataset docs

Description

I usually look up the docs for a DataSet by searching Google for something like kedro csv, which lands me on the kedro.extras.datasets.pandas.CSVDataSet docs. That page is great for seeing the API, but it is confusing that the suggested way to define the catalog is YAML while the examples in the docs are in Python.

Search for the docs

[image: screenshot of the search results]

Current page

Currently the docs look like this, and do not include good examples for creating real catalog entries with the dataset.

[image: screenshot of the current docs page]

But the suggested way to add datasets to the catalog is the YAML API, which looks like this.

[image: screenshot of a YAML catalog entry]
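
Since the screenshot is not reproduced here, the following is a minimal sketch of how a YAML catalog entry lines up with a plain Python dict. The dataset name cars, the file path, and the to_yaml helper are illustrative assumptions, not copied from the Kedro docs. The helper only handles nested dicts of scalars and quotes every string, which is still valid YAML.

```python
import json

# Hypothetical catalog entry: the name "cars" and the path are made up
# for illustration. "pandas.CSVDataSet" is the dataset discussed above.
catalog_config = {
    "cars": {
        "type": "pandas.CSVDataSet",
        "filepath": "data/01_raw/cars.csv",
        "load_args": {"sep": ","},
    }
}


def to_yaml(mapping, indent=0):
    """Render a nested dict of scalars as block-style YAML text."""
    pad = " " * indent
    lines = []
    for key, value in mapping.items():
        if isinstance(value, dict):
            # Nested mapping: emit the key, then the children two spaces deeper.
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 2))
        else:
            # JSON scalars (quoted strings, numbers, true/false) are valid YAML.
            lines.append(f"{pad}{key}: {json.dumps(value)}")
    return "\n".join(lines)


# Text that could be pasted into a catalog.yml file.
print(to_yaml(catalog_config))

# With Kedro installed, the same dict could be loaded directly (not run here):
#   from kedro.io import DataCatalog
#   catalog = DataCatalog.from_config(catalog_config)
#   df = catalog.load("cars")
```

The point of the sketch is that the YAML entry and the Python dict carry the same keys, so a docs page could show either form without changing what the dataset does.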

Context

Aligning the preferred method of creating catalogs with the likely entry points into the docs would encourage users to adopt that method, and it would reduce confusion for those who aren’t quite sure of the difference between the Python API and the YAML API.

Possible Implementation

Include YAML examples in the DataSet docstrings, with backlinks to the documented ways of implementing the catalog. If the Python API example is kept, it should link to a page that shows how to use that example in a project.
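
A rough sketch of what such a docstring could look like, plus a small check that both examples are present. The class CSVDataSetDocSketch, the helper missing_examples, and the example path are hypothetical names invented for illustration; this is not the wording that was eventually added to Kedro.

```python
import inspect


class CSVDataSetDocSketch:
    """Hypothetical stand-in for a dataset class, used only to show the
    docstring layout proposed in this issue.

    Example using the YAML API (catalog.yml):

    .. code-block:: yaml

        cars:
          type: pandas.CSVDataSet
          filepath: data/01_raw/cars.csv
          load_args:
            sep: ","

    Example using the Python API:

    .. code-block:: python

        from kedro.extras.datasets.pandas import CSVDataSet

        data_set = CSVDataSet(filepath="data/01_raw/cars.csv", load_args={"sep": ","})
        df = data_set.load()
    """


# Markers that a docstring should contain if it documents both APIs.
REQUIRED_MARKERS = (".. code-block:: yaml", ".. code-block:: python")


def missing_examples(obj):
    """Return the markers that are absent from obj's docstring."""
    doc = inspect.getdoc(obj) or ""
    return [marker for marker in REQUIRED_MARKERS if marker not in doc]


# An empty list means both a YAML and a Python example were found.
print(missing_examples(CSVDataSetDocSketch))
```

A check like missing_examples could be run over every dataset class to find docstrings that still lack a YAML example.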

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 14 (11 by maintainers)

Top GitHub Comments

2 reactions
noklam commented, Aug 5, 2022

Closing this since the CSVDataSet example is in place. #1762 will be the new issue to track the progress of enhancing DataSet API Docs

CC @levimjoseph

2 reactions
yetudada commented, Oct 22, 2020

I’ve always wondered about this too. I think it would be great to see YAML equivalent examples for the datasets. I’ll create hacktoberfest tickets for this.
