Kedro Tutorial pandas.CSVDataSet not found and kedro.extras ModuleNotFoundError
Description
kedro ipython fails with
DataSetError: An exception occurred when parsing config for DataSet `companies`:
Class `pandas.CSVDataSet` not found.
when a dataset is referenced as pandas.CSVDataSet.
Context
I followed the tutorial up to the “Setting up the data” - “Reference all datasets” step. I referenced the two datasets as pandas.CSVDataSet and tried to check that they are correctly referenced by running context.catalog.load("companies").head() in a kedro ipython session.
Steps to Reproduce
- Create a new project and install dependencies as shown in the Kedro Spaceflights tutorial.
- Download the data files and reference the files as shown in the Setting up the data part:
    companies:
      type: pandas.CSVDataSet
      filepath: data/01_raw/companies.csv

    reviews:
      type: pandas.CSVDataSet
      filepath: data/01_raw/reviews.csv
- Run kedro ipython
Expected Result
The kedro ipython session should start and context.catalog.load("companies").head() should display the first rows of the dataset.
Actual Result
When I run kedro ipython I get:
DataSetError: An exception occurred when parsing config for DataSet `companies`:
Class `pandas.CSVDataSet` not found.
If the dataset is referenced as CSVLocalDataSet, the kedro ipython session starts correctly and context.catalog.load("companies").head() displays the first rows of the dataset. However, if I then run `from kedro.extras.datasets.pandas import CSVDataSet`, it fails with:
ModuleNotFoundError: No module named 'kedro.extras'
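The two failures above point the same way: the installed release does not ship the class path that the catalog entry and the import expect. A minimal sketch for checking which dataset class path resolves in the current environment is shown below. The helper name `first_importable` is hypothetical, and the `kedro.io` module path for `CSVLocalDataSet` is an assumption (the report names only the class, not its module).

```python
import importlib


def first_importable(candidates):
    """Return the first dotted 'module.Class' path that resolves, else None."""
    for dotted_path in candidates:
        module_name, _, class_name = dotted_path.rpartition(".")
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Covers ModuleNotFoundError, e.g. "No module named 'kedro.extras'".
            continue
        if hasattr(module, class_name):
            return dotted_path
    return None


CANDIDATES = [
    # Import path that failed in this report.
    "kedro.extras.datasets.pandas.CSVDataSet",
    # Class that worked in this report; the kedro.io location is an assumption.
    "kedro.io.CSVLocalDataSet",
]

if __name__ == "__main__":
    print(first_importable(CANDIDATES))
```

Running the script inside the project's virtual environment prints the first path that resolves, or None if neither does, which indicates which class name the catalog can use with the installed version.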
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
- Kedro version used (using pip show kedro): 0.15.5
- Python version used (python -V): Python 3.7.6
- Operating system and version: macOS High Sierra 10.13.6
Issue Analytics
- Created 4 years ago
- Comments: 12
Top GitHub Comments
Hello @Mshindi777! 👋
Thank you for raising this issue. I’ve explained why this is happening in this PR comment here: https://github.com/quantumblacklabs/kedro/pull/222#issuecomment-586580697 and intend to fix it first thing Monday morning.
To view the correct documentation, at the bottom right of the sidebar on the left in the documentation, you should be able to switch the documentation version from latest to stable.
Hope that helps and sorry for the confusion re: docs!
Hi @pmbaumgartner,
We’re aware of this and it’s due to the way we handle our dependencies. As a stopgap for now, pip install "kedro[all]" should get you up and running.
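Since the comment attributes the problem to dependency handling, one way to confirm whether the stopgap took effect is to check that the optional packages a dataset type relies on can be found. The following is only a sketch: the helper name `missing_modules` is hypothetical, and the choice of `pandas` as the module to check is an assumption based on the dataset type in the report, not a list taken from Kedro.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Hypothetical selection; adjust it to the dataset types your catalog uses.
OPTIONAL_MODULES = ["pandas"]

if __name__ == "__main__":
    print(missing_modules(OPTIONAL_MODULES))
```

An empty list means every listed module can be located; any names printed are candidates for the missing dependency.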