Modular Pipelines
Description
We’ve seen something incredible evolve through continued use of Kedro. Teams around the world are starting to use Kedro to create stores of reusable pipelines.
Last year, we introduced basic support for Modular Pipelines and this year we’re doubling down on this area.
In our world, a modular pipeline is a series of generalised and connected Python functions that have inputs and outputs (a minimal sketch follows the list below). A modular pipeline:
- Can be easily added to an existing or new Kedro project
- Has virtually no learning curve, if you know how to use Kedro
- Can be tested by itself, to ensure high-quality code
- Does not have a Kedro version dependency (related to #219)
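For illustration, here is a minimal sketch of what such a pipeline might look like, following Kedro's `create_pipeline()` convention; the function and dataset names are invented for the example rather than taken from any real project:

```python
# Hypothetical modular pipeline; function and dataset names are illustrative.
from kedro.pipeline import Pipeline, node


def clean(raw_data):
    # Placeholder transformation: drop incomplete rows from a DataFrame-like input.
    return raw_data.dropna()


def summarise(clean_data):
    # Placeholder transformation: produce summary statistics.
    return clean_data.describe()


def create_pipeline(**kwargs) -> Pipeline:
    # Conventional entry point so the pipeline can be assembled into any project.
    return Pipeline(
        [
            node(clean, inputs="raw_data", outputs="clean_data", name="clean"),
            node(summarise, inputs="clean_data", outputs="summary", name="summarise"),
        ]
    )
```

Because the pipeline only declares dataset names, the hosting project decides what those names resolve to in its catalog, which is what makes the same pipeline reusable across projects.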
Context
The final evolution of Modular Pipelines will see an ecosystem of reusable pipelines. However, for now we want to focus on allowing users to easily add pre-assembled pipelines to an existing or new Kedro project and export their own pre-assembled pipelines.
Next steps
Give us feedback if you've tried Modular Pipelines and the basic support we have for using them, like pipeline.transform(). Modular Pipelines also have implications for kedro-viz, and we can't wait to show you what we have in mind for this.
Top GitHub Comments
Also as an update (for whoever is interested), we're looking to include this feature in the next breaking release (0.16.0). We've merged the pipeline() helper here, a slightly cleaner alternative to Pipeline.transform(), which we're dropping, for mapping input/output/parameter names or namespacing (prefixing) datasets and node names. There's also work being done on the CLI side to help with the workflow of creating and working with modular pipelines. This includes generating a new pipeline, packaging an existing pipeline, and pulling an existing pipeline from somewhere and integrating it into a Kedro project.

@yetudada Not sure if this is where you'd like the feedback, but this is essentially how we've been building all our pipelines. One of the sticking points I've found is how to write tests that ensure the pipelines work within a Kedro context. What I've resorted to doing is writing tests that create temporary Kedro projects, then test the pipelines within them.
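To make the first comment concrete, here is a rough sketch of how the pipeline() helper could be used to remap inputs and namespace a reusable pipeline. The dataset and namespace names are invented for the example, and the keyword arguments shown are those described in the comment (input/output/parameter mappings plus a namespace):

```python
# Sketch of the pipeline() helper; dataset and namespace names are illustrative.
from kedro.pipeline import Pipeline, node, pipeline


def train(features):
    # Placeholder training step.
    return {"trained_on": len(features)}


base = Pipeline([node(train, inputs="features", outputs="model", name="train")])

# Reuse the same pipeline twice: remap its free input and prefix its
# remaining datasets and node names with a namespace.
customer_pipe = pipeline(base, inputs={"features": "customer_features"}, namespace="customers")
product_pipe = pipeline(base, inputs={"features": "product_features"}, namespace="products")

combined = customer_pipe + product_pipe
```

On the testing question, one lighter-weight option (not the temporary-project approach described above) is to exercise a modular pipeline without a full Kedro project by running it against an in-memory DataCatalog with the SequentialRunner. The pipeline and data below are invented for the example, and exact class names can differ between Kedro versions:

```python
# Sketch: running a modular pipeline in a test with an in-memory catalog.
import pandas as pd
from kedro.io import DataCatalog, MemoryDataSet
from kedro.pipeline import Pipeline, node
from kedro.runner import SequentialRunner


def summarise(raw_data):
    return raw_data.describe()


def test_modular_pipeline_runs():
    pipe = Pipeline([node(summarise, inputs="raw_data", outputs="summary")])
    catalog = DataCatalog({"raw_data": MemoryDataSet(pd.DataFrame({"x": [1.0, 2.0, 3.0]}))})
    outputs = SequentialRunner().run(pipe, catalog)
    # Final outputs not registered in the catalog are returned by the runner.
    assert "summary" in outputs
```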