Technical design doc for `KedroSession`

The KedroSession

The KedroSession is the object responsible for managing the lifecycle of a Kedro run. It has two main functions:

  1. Run execution: it makes sure that all core components Kedro needs to execute a run are instantiated and that the run is executed properly.
  2. Persisting run data: KedroSession offers a way to persist run data through the session store. The following data gets saved in the session store:
     • package_name
     • project_path
     • session_id
     • CLI info: command run, run parameters
     • Git info: git sha, git branch, whether the branch is dirty
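
To make the list above concrete, here is a minimal sketch of how one persisted record could be laid out as a plain dictionary. The function name `build_run_record` and the nested key names (`cli`, `git`, `command`, `params`, `sha`, `branch`, `dirty`) are assumptions made for illustration; they are not taken from Kedro's actual schema.

```python
from typing import Any, Dict


def build_run_record(
    package_name: str,
    project_path: str,
    session_id: str,
    command: str,
    params: Dict[str, Any],
    git_sha: str,
    git_branch: str,
    git_dirty: bool,
) -> Dict[str, Any]:
    """Group the run data listed above into one dictionary (hypothetical layout)."""
    return {
        "package_name": package_name,
        "project_path": project_path,
        "session_id": session_id,
        # CLI info: the command that was run and its parameters
        "cli": {"command": command, "params": params},
        # Git info: commit, branch, and whether there are uncommitted changes
        "git": {"sha": git_sha, "branch": git_branch, "dirty": git_dirty},
    }


# All values below are made up for the example.
record = build_run_record(
    package_name="demo_project",
    project_path="/tmp/demo-project",
    session_id="2022-03-10T12.00.00",
    command="kedro run",
    params={"pipeline": "__default__"},
    git_sha="0000000",
    git_branch="main",
    git_dirty=False,
)
```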

Usage within Kedro 🏗

The KedroSession is a relatively new component within Kedro and, at the time of writing, is mainly used to manage run lifecycles and for experiment tracking. The experiment tracking feature uses a session store implementation called the SQLiteStore, which persists data with SQLite. The other session store implementations available in Kedro are:

  1. BaseSessionStore: the base class for all session stores; it doesn’t persist any data itself
  2. ShelveStore: implementation that uses the shelve package to persist data
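
The difference between a store that persists nothing and one backed by `shelve` can be sketched in a few lines of standard-library Python. This is not Kedro's code: the class names `NoOpStore` and `ShelveBackedStore`, the `read`/`save` method split, and the on-disk layout (`<path>/<session_id>/store`) are assumptions chosen for the example.

```python
import dbm
import shelve
import tempfile
from collections import UserDict
from pathlib import Path


class NoOpStore(UserDict):
    """Dict-like store that keeps run data in memory and never writes it out."""

    def __init__(self, path: str, session_id: str):
        self.path = Path(path)
        self.session_id = session_id
        super().__init__(self.read())

    def read(self) -> dict:
        # Nothing was ever persisted, so there is nothing to load.
        return {}

    def save(self) -> None:
        # Deliberately does nothing.
        pass


class ShelveBackedStore(NoOpStore):
    """Same interface, but read() and save() go through the shelve module."""

    def _location(self) -> str:
        return str(self.path / self.session_id / "store")

    def read(self) -> dict:
        try:
            with shelve.open(self._location(), flag="r") as db:
                return dict(db)
        except dbm.error:
            # No shelf exists yet for this session.
            return {}

    def save(self) -> None:
        (self.path / self.session_id).mkdir(parents=True, exist_ok=True)
        with shelve.open(self._location()) as db:
            db.clear()
            db.update(self.data)


base_dir = tempfile.mkdtemp()

store = ShelveBackedStore(base_dir, "session-a")
store["package_name"] = "demo_project"
store.save()
reloaded = ShelveBackedStore(base_dir, "session-a")  # reads the shelf back

volatile = NoOpStore(base_dir, "session-b")
volatile["package_name"] = "demo_project"
volatile.save()
reloaded_volatile = NoOpStore(base_dir, "session-b")  # starts empty again
```

Keeping both classes behind the same dict-like interface means the code that fills in run data does not need to know whether anything is written to disk.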

Relation of a run and a session 🧑‍🤝‍🧑

While working on https://github.com/kedro-org/kedro/issues/1273 it was decided that a Kedro session and a Kedro run have a one-to-one mapping. This means that once a session has been created, only one full pipeline run can ever be kicked off during that session’s existence. In practice, Kedro manages this for you under the hood when `kedro run` is executed.
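
One way to picture the one-run-per-session rule is a guard flag that is set on the first run and checked on every later attempt. The sketch below is a standalone illustration; `OneRunSession` and `SessionAlreadyUsedError` are hypothetical names, not Kedro classes.

```python
from typing import Any, Callable


class SessionAlreadyUsedError(RuntimeError):
    """Raised when a second run is attempted in the same session (hypothetical)."""


class OneRunSession:
    """Stand-in session that allows exactly one run during its lifetime."""

    def __init__(self, session_id: str):
        self.session_id = session_id
        self._run_called = False

    def run(self, pipeline: Callable[[], Any]) -> Any:
        if self._run_called:
            raise SessionAlreadyUsedError(
                f"Session {self.session_id} has already kicked off a run; "
                "create a new session for another run."
            )
        # Mark the session as used before executing, so even a failed run
        # counts as this session's single run.
        self._run_called = True
        return pipeline()


session = OneRunSession("session-1")
first_result = session.run(lambda: "done")

try:
    session.run(lambda: "done again")
    second_run_rejected = False
except SessionAlreadyUsedError:
    second_run_rejected = True
```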

FAQ ❓

How does a Kedro user use KedroSession?
As a Kedro user you don’t need to access the session directly. When you execute the `kedro run` command, a new session is created automatically. The session then kicks off the pipeline run. When the run finishes, the session is closed, and any run data is persisted if the project is configured with a persistent session store.
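
The create, run, close-and-persist sequence described in this answer maps naturally onto a context manager. The following is a rough sketch of that shape only; `LifecycleSession`, its `persist` callback, and the recorded keys are assumptions for the example and do not mirror Kedro's implementation.

```python
from typing import Any, Callable, Dict, List


class LifecycleSession:
    """Hypothetical session: created, used for one run, persisted on close."""

    def __init__(self, session_id: str, persist: Callable[[Dict[str, Any]], None]):
        self.session_id = session_id
        self.run_data: Dict[str, Any] = {"session_id": session_id}
        self._persist = persist

    def __enter__(self) -> "LifecycleSession":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        # Closing happens whether or not the run raised.
        self.close()

    def run(self, pipeline: Callable[[], Any]) -> Any:
        self.run_data["command"] = "run"
        return pipeline()

    def close(self) -> None:
        # Hand a copy of the run data to whatever store was configured.
        self._persist(dict(self.run_data))


saved: List[Dict[str, Any]] = []  # stands in for a persistent session store

with LifecycleSession("2022-03-10T12.00.00", persist=saved.append) as session:
    result = session.run(lambda: "pipeline finished")
```

With a store that does not persist anything, `persist` would simply be a function that ignores its argument, which matches the "if the project is configured with a persistent session store" condition above.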

What about using KedroSession in an interactive workflow?
When using Jupyter or IPython you can access the active session object or create a new one. You can then retrieve the session_id, inspect the run data that will be stored, load the context, and execute a run. However, we do not encourage using the session for anything other than checking the session_id and run data.

Related Github issues and PRs:

Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions: 3
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

2 reactions
lorenabalan commented, Mar 10, 2022

@AntonyMilneQB I think not being able to run anything in a Jupyter notebook / IPython takes away a lot from the Jupyter users we’re trying to convert to Python and Kedro. If we do that, we need to seriously consider the consequences and clearly draw the boundaries of our target audience, because it sounds like they would be very different.

2 reactions
noklam commented, Mar 9, 2022

@AntonyMilneQB For me, it’s the ability to do checkpoint debugging in an interactive environment that matters. It may be that I am not doing it the right way, but I am interested in how others are using the Kedro IPython/notebook setup for anything other than EDA.

Just to recap, this is the workflow that I adopted in the past for development.

  1. Run a partial pipeline and stop at the point of interest.
  2. Do whatever I need in a notebook environment, e.g. changing the definition of a node, or injecting or overwriting some of the data in the catalog.
  3. Continue to run the pipeline until I get my desired output.
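
The three steps above can be sketched without Kedro, using a list of named functions as the pipeline and a dictionary as the catalog. Everything here (`run_nodes`, the `from_node`/`to_node` parameters, the node names, and the data) is hypothetical and only illustrates the stop, edit, and resume pattern.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# (node name, function, input key in the catalog, output key in the catalog)
Node = Tuple[str, Callable[[Any], Any], str, str]


def run_nodes(
    nodes: List[Node],
    catalog: Dict[str, Any],
    from_node: Optional[str] = None,
    to_node: Optional[str] = None,
) -> Dict[str, Any]:
    """Run a contiguous slice of the pipeline, reading and writing the catalog."""
    names = [name for name, _, _, _ in nodes]
    start = names.index(from_node) if from_node else 0
    stop = names.index(to_node) + 1 if to_node else len(nodes)
    for _, func, source, target in nodes[start:stop]:
        catalog[target] = func(catalog[source])
    return catalog


nodes: List[Node] = [
    ("clean", lambda rows: [r for r in rows if r is not None], "raw", "cleaned"),
    ("scale", lambda rows: [r * 10 for r in rows], "cleaned", "scaled"),
    ("total", sum, "scaled", "total"),
]
catalog: Dict[str, Any] = {"raw": [1, None, 2, 3]}

# 1. Run a partial pipeline and stop at the point of interest.
run_nodes(nodes, catalog, to_node="clean")
checkpoint = list(catalog["cleaned"])

# 2. Inspect and overwrite some of the data in the catalog.
catalog["cleaned"] = [1, 2]

# 3. Continue from the next node until the final output is produced.
run_nodes(nodes, catalog, from_node="scale")
```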