Should we change the output of `session.run`?

Background

What is the output of session.run()? Currently it is not as clear as you might think, and it isn’t documented anywhere. The logic is defined in runner.py and can be counter-intuitive in some cases. Is there a good reason why we want it to behave this way?

https://github.com/kedro-org/kedro/blob/f4914201d7f6f38318c2c1f074fcdf802b3e1e0d/kedro/runner/runner.py#L78-L91
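
As a rough illustration of the behaviour being questioned: assuming the linked lines return only those pipeline outputs that have no entry in the catalog (the "free outputs"), the selection can be modelled with plain sets. This is a toy sketch, not Kedro code; `free_outputs` and the dataset names are hypothetical.

```python
from __future__ import annotations

# Toy model only: plain Python sets stand in for the pipeline and the catalog.
# None of these names are Kedro APIs.


def free_outputs(pipeline_outputs: set[str], catalog_entries: set[str]) -> set[str]:
    """Return the outputs that no catalog entry claims."""
    return pipeline_outputs - catalog_entries


pipeline_outputs = {"model", "metrics"}
catalog_entries = {"raw_data", "metrics"}  # "metrics" is registered, so it is persisted

returned = free_outputs(pipeline_outputs, catalog_entries)
print(sorted(returned))
```

Under this assumption, registering an output in the catalog changes what the run hands back, which is the part that can feel counter-intuitive.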

Kedro has improved a lot in how it runs a pipeline as a standalone application with packaging and KedroSession; #1423 documents the different ways to do it. Personally, I think it is still not easy enough for someone inexperienced with Kedro to integrate with it. #1423 mentions how a pipeline can be called programmatically. Even though the pipeline itself is a function call, it doesn’t behave like a function: you can’t easily pass an input as an argument (it has to be a Catalog entry), and the output of the pipeline is also very restricted.

Motivation

Kedro works really well within the Kedro world, but that also means Kedro works very differently from the rest of the Python world.

This issue mainly focuses on the output side, which would improve the experience of integrating a Kedro pipeline as an upstream step. In an over-simplified world, this should be straightforward to do. Currently I think we make a strong assumption that people work with a “Kedro Project”, but if we are moving towards a Kedro package, i.e. using from kedro_package import main, it should behave just like a Python function. I think this is a reasonable expectation.

1. df = get_some_data()
2. model = my_kedro_pipeline(input={'my_pipeline_input_df': df})
3. app = PredictionWebService(model)
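
The function-like behaviour sketched in the three steps above could look roughly like this. It is a minimal sketch with a toy pipeline, not Kedro's API: `run_as_function`, the `Node` tuple layout, and the `keep_intermediate` flag are all hypothetical, and the nodes are assumed to be listed in execution order.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-in for a pipeline node: (function, input names, output name).
Node = Tuple[Callable[..., Any], List[str], str]


def run_as_function(
    nodes: List[Node],
    inputs: Dict[str, Any],
    keep_intermediate: bool = False,
) -> Dict[str, Any]:
    """Run nodes in the given order, taking inputs as arguments and returning a dict."""
    data = dict(inputs)
    for func, input_names, output_name in nodes:
        data[output_name] = func(*(data[name] for name in input_names))
    if keep_intermediate:
        # Everything the run produced, useful in an interactive session.
        return {name: value for name, value in data.items() if name not in inputs}
    # Otherwise return only outputs that no later node consumes.
    consumed = {name for _, input_names, _ in nodes for name in input_names}
    return {out: data[out] for _, _, out in nodes if out not in consumed}


def clean(rows):
    return [r for r in rows if r is not None]


def fit_mean(rows):
    return sum(rows) / len(rows)


nodes = [
    (clean, ["my_pipeline_input_df"], "clean_rows"),
    (fit_mean, ["clean_rows"], "model"),
]
result = run_as_function(nodes, {"my_pipeline_input_df": [1.0, None, 3.0]})
everything = run_as_function(
    nodes, {"my_pipeline_input_df": [1.0, None, 3.0]}, keep_intermediate=True
)
```

Here `result` holds only the final `model` entry, while `everything` also includes `clean_rows`, which mirrors the interactive-workflow option raised under "Things to consider".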

Questions

  • What should session.run return?

Things to consider

  • How can any Python developer integrate with a Kedro pipeline easily? Can it behave just like a function?
  • In an interactive workflow, it may make sense to keep all intermediate outputs in the resulting dict.
  • Is there a known reason why the output is defined as it is?

Related Issues:

  • #1721: this would enable a better interactive workflow
  • #1423: tries to improve and simplify how we run a Kedro pipeline as a standalone package
  • #795: discussion of how to use Kedro as an upstream or downstream step

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

noklam commented, Oct 6, 2022

I gave it a go to see what it would take to make the initial idea work, partly because I wanted to test how the nbdev system works. See DebugRunner:

https://noklam.github.io/kedro-debug-runner/core.html

noklam commented, Oct 5, 2022

A supplement to the above comments, to address @AntonyMilneQB’s question:

i.e. we could have free_output = pipeline.outputs()

The answer is that there is a catalog.load call at the end, which is expensive and potentially memory-hungry. Persisted datasets are therefore released from memory as soon as they are no longer needed. A MemoryDataSet is already loaded in memory, so there is no harm in returning it.
