Consistent nodes execution order with `SequentialRunner`
Description
I’m always frustrated when executing an identical pipeline yields different results. The root cause is that multiple valid node execution orders exist. Kedro only tries to resolve the DAG by finding one possible solution, but that solution is not guaranteed to be the same on every run.
One workaround is to specify the inputs/outputs of nodes so that there is only one possible solution, but this is not ideal, as users have to maintain arbitrary dummy variables.
What is missing in the DAGs?
The seed (and therefore the state) of the random number generator. Consider a simple pipeline with 3 nodes:
A-- \
     \
      C
     /
    /
B--
In this pipeline, there are 2 possible execution orders with `SequentialRunner`: 1. A->B->C, 2. B->A->C. Although there is no strong preference for either one, it is better to stick with one of them, because the output can change with the order.
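To make the effect concrete, here is a minimal sketch in plain Python (not Kedro code). It assumes two hypothetical nodes, `node_a` and `node_b`, that each draw one number from the shared global random number generator. With the same seed, swapping the order swaps which node receives which draw:

```python
import random


def node_a():
    # Hypothetical node: consumes one draw from the global RNG.
    return random.random()


def node_b():
    # Hypothetical node: also consumes one draw from the global RNG.
    return random.random()


def run(order, seed=42):
    """Seed the global RNG once, then run the nodes in the given order."""
    random.seed(seed)
    funcs = {"A": node_a, "B": node_b}
    return {name: funcs[name]() for name in order}


run_ab = run(["A", "B"])
run_ba = run(["B", "A"])

# Same seed, but each node sees a different draw depending on the order.
print(run_ab["A"] == run_ba["B"])  # True: both took the first draw
print(run_ab["A"] == run_ba["A"])  # False: first draw vs. second draw
```

So even with a fixed seed, the results of A and B (and hence the inputs to C) depend on which of the two valid orders the runner happens to pick.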
Context
In data science/machine learning pipelines, setting a seed to ensure reproducible results is very common, and currently there is no easy way to achieve this.
Possible Implementation
Ensure the resolved nodes are sorted so that they always run in the same order with `SequentialRunner`.
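One way this could look, as a rough sketch rather than Kedro’s actual implementation: a Kahn-style topological sort that always picks the smallest ready node name first, so ties are broken the same way on every run. The function name `deterministic_order` and the dict-of-sets input format are assumptions made for illustration only.

```python
import heapq


def deterministic_order(dependencies):
    """Return a topological order that is identical on every run.

    `dependencies` maps a node name to the set of node names it depends on.
    Ties between ready nodes are broken by name, using a min-heap.
    """
    # Collect every node, including those that only appear as a dependency.
    nodes = set(dependencies)
    for deps in dependencies.values():
        nodes |= set(deps)

    remaining = {n: set(dependencies.get(n, ())) for n in nodes}
    dependents = {n: [] for n in nodes}
    for name, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(name)

    # Nodes without unmet dependencies are ready to run.
    ready = [n for n, deps in remaining.items() if not deps]
    heapq.heapify(ready)

    order = []
    while ready:
        current = heapq.heappop(ready)  # smallest name first
        order.append(current)
        for child in dependents[current]:
            remaining[child].discard(current)
            if not remaining[child]:
                heapq.heappush(ready, child)

    if len(order) != len(nodes):
        raise ValueError("The dependency graph contains a cycle")
    return order


print(deterministic_order({"C": {"A", "B"}}))  # ['A', 'B', 'C']
```

Sorting by name is an arbitrary choice here; any stable key would do, as long as it does not depend on set or hash ordering.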
Possible Alternatives
(Optional) Describe any alternative solutions or features you’ve considered.

I’ll just add that users do ask for this maybe once every two months. I’ve even seen people introduce fake nodes to force ordering.
So on this: we use the external `toposort~=1.5` library. Since Kedro was released, there is now a stdlib module, `graphlib` (https://docs.python.org/3/library/graphlib.html), that does the same thing. If we do look into making this deterministic, it might be a good opportunity to adopt this too.
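For reference, a minimal sketch of what a deterministic order on top of `graphlib` might look like (Python 3.9+). This is only an illustration, not a proposed patch: the helper name `sorted_static_order` is made up, and it assumes that sorting each batch returned by `get_ready()` by name is an acceptable tie-break.

```python
from graphlib import TopologicalSorter


def sorted_static_order(graph):
    """Topological order with ties inside each ready batch broken by name.

    `graph` maps each node to an iterable of its predecessors, which is the
    input format TopologicalSorter expects.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

    order = []
    while sorter.is_active():
        # get_ready() returns every node whose predecessors are all done;
        # sorting the batch removes any dependence on set iteration order.
        for name in sorted(sorter.get_ready()):
            order.append(name)
            sorter.done(name)
    return order


print(sorted_static_order({"C": {"A", "B"}}))  # ['A', 'B', 'C']
```

The explicit `prepare()` / `get_ready()` / `done()` loop is used instead of `static_order()` so that the tie-break between ready nodes is spelled out rather than left to the library.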