Versioned datasets don't work with Prefect
Description
Versioned datasets used as outputs cause nodes to fail when scheduling runs with Prefect.
Context
I’m trying to run a Kedro pipeline in Prefect. Because some of the output datasets used by nodes in the pipeline are versioned, running the node fails.
Steps to Reproduce
- Follow the Prefect deployment guide to set up Prefect and register the pipeline with Prefect. As a side note, the script on the current page doesn’t work for me and I had to update one of the imports.
- Running the pipeline in Prefect works once. Running it a second time or scheduling runs fails with a DataSetError.
Expected Result
Pipeline should be able to run when triggered after the first time.
Actual Result
Execution fails with the following error:
kedro.io.core.DataSetError: Save path `/a/b/c/2022-03-29T14.20.46.529Z/xyz`
for ParquetDataSet(filepath=/a/b/c/xyz, load_args={}, protocol=file, save_args={},
version=Version(load=None, save='2022-03-29T14.20.46.529Z'))
must not exist if versioning is enabled.
It seems the dataset is reusing the timestamp from when register_prefect_flow.py was executed instead of the actual run time of the pipeline. In this case, I ran the registration script at 14:20 and triggered the pipeline at 14:23, but the timestamp in the error message above corresponds to the script run time, not the trigger time.
I looked around in the Kedro code a bit and it seems the function generating the timestamp is cached, but I'm not sure if that’s all there is to it.
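To illustrate the suspected mechanism, here is a minimal, self-contained sketch that does not use Kedro or Prefect. Every name in it (`CachedVersionDataset`, `SavePathExistsError`, `FAKE_FS`, `generate_timestamp`) is hypothetical and only imitates the behaviour described above: a dataset object that resolves its save version once and caches it will try to write to the same versioned path on every later run if the same object is reused. It is an assumption that this matches what Kedro does internally.

```python
from datetime import datetime, timezone

# Hypothetical in-memory stand-in for the filesystem: the set of saved paths.
FAKE_FS = set()


class SavePathExistsError(Exception):
    """Hypothetical stand-in for the DataSetError shown in the report."""


def generate_timestamp() -> str:
    """Return the current UTC time, formatted like the version in the error."""
    now = datetime.now(timezone.utc)
    millis = now.microsecond // 1000
    return now.strftime("%Y-%m-%dT%H.%M.%S.") + f"{millis:03d}Z"


class CachedVersionDataset:
    """Toy versioned dataset that caches its save version per instance."""

    def __init__(self, filepath: str) -> None:
        self.filepath = filepath
        self._save_version = None  # resolved lazily, then reused

    def resolve_save_version(self) -> str:
        # The timestamp is generated on first use only; later calls
        # return the cached value, however much time has passed.
        if self._save_version is None:
            self._save_version = generate_timestamp()
        return self._save_version

    def save(self) -> str:
        path = f"{self.filepath}/{self.resolve_save_version()}"
        if path in FAKE_FS:
            raise SavePathExistsError(
                f"Save path `{path}` must not exist if versioning is enabled."
            )
        FAKE_FS.add(path)
        return path


if __name__ == "__main__":
    # One instance built at "registration" time and reused across runs.
    dataset = CachedVersionDataset("/a/b/c/xyz")
    dataset.save()  # first run succeeds
    try:
        dataset.save()  # second run reuses the cached version
    except SavePathExistsError as err:
        print(f"second run failed: {err}")
```

If this is indeed the cause, one possible direction is to build the datasets (or the whole catalog) inside the task at run time, so that each run resolves a fresh timestamp instead of reusing the one captured at registration.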
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
- Kedro version used (pip show kedro or kedro -V): 0.17.7
- Python version used (python -V): 3.8.12
- Operating system and version: MacOS 11.6.4
- Prefect version: 1.1.0

This might still be relevant. I think the PR only updates the documentation to the latest Prefect version and doesn’t fix the issue with versioned datasets.
@ardoi and @avan-sh there’s now a PR open that addresses this issue: https://github.com/kedro-org/kedro/pull/1775. It would be awesome to get your review on that!