
Experiment Tracking in Kedro

See original GitHub issue

Why should we care about Experiment Tracking?

Experiment tracking is a way to record all information that you would need to recreate a data science experiment. We think of it as logging for parameters, metrics, models and other artefacts.

Kedro currently has parts of this functionality. For example, it’s possible to log parameters as part of your codebase and snapshot models and other artefacts like plots with Kedro’s versioning capabilities for datasets. However, Kedro is missing a way to log metrics and capture all this logged metadata as a timestamped run of an experiment. It is also missing a way for users to visualise, discover and compare this logged metadata.
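To make the gap concrete, here is a hypothetical sketch of what the “Kedro way” could look like: a node returns its metrics as a plain dictionary, and a catalog entry marks that output as tracked and versioned. The dataset type and catalog options shown are illustrative assumptions, not an existing API.

```python
# A hypothetical sketch of how metric logging could fit into a Kedro project.
# The node simply returns a dict of metrics; the catalog entry in the comment
# below is illustrative only -- the dataset type and its options are assumptions.

from kedro.pipeline import Pipeline, node


def evaluate_model(model, test_x, test_y) -> dict:
    """Score the model and return metrics as a plain dictionary."""
    return {"accuracy": model.score(test_x, test_y)}


def create_pipeline() -> Pipeline:
    return Pipeline(
        [
            node(
                func=evaluate_model,
                inputs=["trained_model", "test_x", "test_y"],
                outputs="model_metrics",  # captured by the catalog entry below
                name="evaluate_model_node",
            )
        ]
    )


# conf/base/catalog.yml (hypothetical entry):
#
# model_metrics:
#   type: tracking.MetricsDataSet   # assumed dataset type for tracked metrics
#   filepath: data/09_tracking/metrics.json
#   versioned: true                 # snapshotted per run, like other versioned datasets
```

A run would then produce a timestamped snapshot of the metrics alongside the existing versioned datasets, which is exactly the kind of metadata Kedro-Viz would need to surface and compare.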

This change is essential to us because we want to standardise how logging for ML is done. There should be one easy way to capture this information, and we’re going to give users the Kedro way to do this.

This functionality is also expected to increase Kedro Lab usage by Data Scientists: anecdotally, people performing the Data Engineering workflow get the most benefit from Kedro-Viz, while the Data Science workflow is not yet accounted for.

What evidence do we have to suggest that we do this?

Our users sense the gap, and one of the most common usage patterns of Kedro is with MLFlow Tracking, which provides this additional functionality. We have seen evidence here:

We also know that our internal users relied on PerformanceAI for this functionality. We have since sunset PerformanceAI, but it was fantastic to use because:

  • It allowed multiple collaborators to share results
  • It integrated nicely with Kedro
  • The UI was great

Our vertical teams, namely C1 (@deepyaman), InsureX (@imdoroshenko @benhorsburgh) and OptimusAI (@mkretsch327) consider this high priority and will be confirmed users of this functionality.

What metrics will we track to prove the success of this feature?

  • kedro viz terminal runs
  • A metric that points to the use of this feature
  • Full adoption of the feature by all vertical teams

What design requirements do we have?

We must allow users to:

  • Keep track of their metrics
  • See the concept of an experiment on Kedro Lab

We must think about:

  • Minimising the number of changes a user would need to make to activate this functionality in an existing Kedro project
  • How users would share their experiment results with other team members
  • How this functionality would work with other MLFlow tools (e.g. model serving)
  • How users would disable runs so that they don’t clutter run history
  • How this functionality works with the KedroSession (a hook-based sketch follows this list)
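To make the KedroSession point concrete, here is a minimal sketch, assuming a hook-based design, of how run metadata could be captured at the pipeline-run boundary using Kedro’s existing hook mechanism. The storage location, field names and registration shown are assumptions for illustration, not a settled design.

```python
# A minimal sketch, assuming a hook-based design: run metadata is captured when
# a pipeline run finishes and appended to a local store. The file layout and
# field names are assumptions, not a settled design.

import json
from datetime import datetime, timezone
from pathlib import Path

from kedro.framework.hooks import hook_impl


class ExperimentTrackingHooks:
    """Record each pipeline run as a timestamped entry in a local JSON-lines log."""

    _store = Path("data/09_tracking/runs.jsonl")  # assumed location

    @hook_impl
    def after_pipeline_run(self, run_params):
        # run_params is supplied by Kedro; we persist only a small subset of it.
        record = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "run_id": run_params.get("session_id") or run_params.get("run_id"),
            "pipeline_name": run_params.get("pipeline_name"),
            "extra_params": run_params.get("extra_params"),
        }
        self._store.parent.mkdir(parents=True, exist_ok=True)
        with self._store.open("a") as f:
            f.write(json.dumps(record) + "\n")


# Registered in src/<package>/settings.py:
# HOOKS = (ExperimentTrackingHooks(),)
```

Disabling a run so that it does not clutter run history could then be as simple as skipping the write in this hook, for example behind an environment variable or a session parameter.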

Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions: 2
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

1 reaction
noklam commented, Jun 22, 2022

Experiment tracking is a way to record all information that you would need to recreate a data science experiment. We think of it as logging for parameters, metrics, models and other artifacts.

Some thoughts after today’s tech design session.

The general statement above gives me the impression that Kedro is offering some “MLOps” capabilities.

I tried to group the experiment tracking features into 2 different categories:

  1. Metrics Tracking / Comparison - more about how to visualise things in the UI and help data scientists do their work.
  2. Reproducible experiment - artefacts/code/environment needed to fully reproduce an experiment.

I think the main focus of this GH issue is on point 1, and I see a lot of consideration of mlflow, but I’d argue mlflow isn’t the best reference for this space. There is much more on offer from tools like wandb, neptune, or clearml. This article summed them up quite well as “Dashboard as Operating System”.

So my question is: how big a role do we expect Kedro to play in this space, and how far do we want to go? Or, what are the things we are not going to do for experiment tracking? (In the same way that Kedro is not going to do any orchestration work.) @yetudada @NeroOkwa CC: @AntonyMilneQB

0 reactions
merelcht commented, Nov 25, 2021

(Comment copied over, originally written by @limdauto)

@AntonyMilneQB thanks for the amazing comments as always!

Re: General Concept

100% agree that we don’t need experiment as an abstraction. I wrote “we can” but I also don’t think “we should” do it. I’d be interested to see if any user has any legitimate use case after trying our workflow. It’s nice to have an escape hatch in the design.

Re: Milestone 1

How to mark which datasets are tracked on kedro viz

Yea actually this is a great point. Let me bring it up with @GabrielComymQB tomorrow. We can do something similar to the parameters.

Metrics Plot

  • Re x-axis: definitely uniformly spaced.
  • Re y-axis with different scales: I was thinking we could do this, but rescaling on a single axis works too!

Re: Milestone 2

Session Type

I think I’m specifically discussing the data type here, i.e. how we represent the session in the viz database. For experiment tracking purposes, we only care about run vs non-run sessions, so I’m thinking of just setting other session types to null for now, including CLI sessions. For the CLI, I don’t know how granular we want to be, e.g. do we want to split cli and jupyter even though we launch Jupyter through the CLI?

Scalability of querying by metrics

This touches on a design iteration that I haven’t mentioned. If we want to query by metrics, we need a metrics-friendly search index. At the very least, we need to set up an index in SQLite to do it: https://www.tutorialspoint.com/sqlite/sqlite_indexes.htm – but there are other solutions, including an in-memory search index where we pay the cost up front when starting viz, or we could even use a full-blown disk-based search index: https://whoosh.readthedocs.io/en/latest/index.html. There are pros and cons to each approach. I will write a separate design doc just for the metrics query, but that will be for a later iteration.
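For illustration, a rough sketch of the SQLite-index option could look like the following; the table layout and column names are assumptions, not the actual Kedro-Viz schema.

```python
# A rough sketch of the "index in SQLite" option: store metrics in a narrow
# (run_id, name, value) table and index it so queries such as accuracy >= 0.8
# avoid a full table scan as runs accumulate. Names here are assumptions.

import sqlite3

conn = sqlite3.connect("session_store.db")  # assumed viz database file
conn.executescript(
    """
    CREATE TABLE IF NOT EXISTS run_metrics (
        run_id TEXT NOT NULL,
        name   TEXT NOT NULL,
        value  REAL NOT NULL
    );
    -- Composite index so lookups by metric name + value range use the index.
    CREATE INDEX IF NOT EXISTS idx_run_metrics_name_value
        ON run_metrics (name, value);
    """
)

# Example query: all runs whose accuracy is at least 0.8, best first.
rows = conn.execute(
    "SELECT run_id, value FROM run_metrics "
    "WHERE name = ? AND value >= ? ORDER BY value DESC",
    ("accuracy", 0.8),
).fetchall()
```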

Scalability of many runs

Since this was still being (visually) designed when I wrote the tech design, I didn’t put it in. But I absolutely agree with you that the ability to find runs in a long list is essential. In the first iteration, from a product point of view, our solution is:

  • Allowing users to favourite, rename and add a note to a run from the viz UI.
  • Finding runs for the first iteration will be done purely with text search.
  • For later iterations, we will extend the text search box with structured queries, e.g. accuracy>=0.8 (a minimal parsing sketch follows this list).
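As referenced in the last bullet, here is an illustrative sketch of how such a structured query could be parsed; the grammar and helper names are hypothetical, not part of any agreed design.

```python
# An illustrative sketch of the "structured query" idea: turn a search-box
# expression such as "accuracy>=0.8" into a predicate over a run's metrics
# dict. The grammar and names are hypothetical.

import operator
import re

_OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
        "<": operator.lt, "==": operator.eq, "=": operator.eq}
_PATTERN = re.compile(r"^\s*(\w+)\s*(>=|<=|==|=|>|<)\s*([\d.]+)\s*$")


def parse_query(text):
    """Return a predicate over a metrics dict, or None if the text is free text."""
    match = _PATTERN.match(text)
    if not match:
        return None
    name, op, value = match.group(1), _OPS[match.group(2)], float(match.group(3))
    return lambda metrics: name in metrics and op(metrics[name], value)


# Usage: filter runs by the parsed predicate, falling back to plain text search.
runs = [{"id": "run-1", "metrics": {"accuracy": 0.83}},
        {"id": "run-2", "metrics": {"accuracy": 0.71}}]
predicate = parse_query("accuracy>=0.8")
print([r["id"] for r in runs if predicate(r["metrics"])])  # ['run-1']
```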

In terms of technical performance, I’m still considering the pros and cons of whether to perform the search client-side or server-side. But I know for a fact that we can do text search client-side on up to thousands of rows easily. For millions of rows, you can employ an embedded in-memory search index such as https://github.com/techfort/LokiJS to help. I’m still debating, though.
