
[FR] learning curve visualization


Willingness to contribute

No. I cannot contribute this feature at this time.

Proposal Summary

Add a learning curve visualization to the Run Comparison view. There are a few different definitions of “learning curve”, but the one I mean here is a graph with step number on the X axis and a metric of interest on the Y axis. Multiple training jobs can be plotted as lines on this graph. Here’s an example from a different tool: (screenshot attachment, 2022-10-28)
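For concreteness, the proposed chart boils down to turning per-run metric histories into one line per run. Here is a minimal stdlib-only sketch; the run names and values are made up, and the (step, value) pairs stand in for whatever the tracking store returns:

```python
def to_series(history):
    """Sort a metric history by step and split it into X/Y lists for plotting."""
    pts = sorted(history)
    xs = [step for step, _ in pts]
    ys = [value for _, value in pts]
    return xs, ys

# Hypothetical metric histories for two runs, as (step, value) pairs.
histories = {
    "run_a": [(0, 1.2), (1, 0.9), (2, 0.7)],
    "run_b": [(0, 1.5), (1, 1.1), (2, 1.0)],
}

# One (xs, ys) series per run — each becomes one line on the chart.
series = {name: to_series(h) for name, h in histories.items()}
```

Each series could then be handed to any line-plotting library; the point is only that the data model is a list of per-step values per run.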

Motivation

What is the use case for this feature?

A data scientist wants to understand the impact of hyperparameter choices on model convergence.

Why is this use case valuable to support for MLflow users in general?

When comparing runs, charting metrics over time for each run is useful for understanding the impact of hyperparameters and informing model selection.

Why is this use case valuable to support for your project(s) or organization?

See above.

Why is it currently difficult to achieve this use case?

MLflow (as of version 1.30.0) doesn’t provide a learning curve visualization.

Details

I think this feature would only require touching the frontend code, since the data structure for metrics tracking already exists.
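That existing data structure is visible in the tracking server’s REST API: GET /api/2.0/mlflow/metrics/get-history already returns the full per-step history for a metric, which is exactly what a frontend chart would consume. A sketch of parsing an illustrative (made-up) response:

```python
import json

# Illustrative response body from GET /api/2.0/mlflow/metrics/get-history
# (the metric values and timestamps here are invented for the example).
payload = json.loads("""
{"metrics": [
  {"key": "val_loss", "value": 0.92, "timestamp": 1667000000000, "step": 0},
  {"key": "val_loss", "value": 0.61, "timestamp": 1667000060000, "step": 1},
  {"key": "val_loss", "value": 0.48, "timestamp": 1667000120000, "step": 2}
]}
""")

# Extract the X axis (steps) and Y axis (metric values) for one run's line.
steps = [m["step"] for m in payload["metrics"]]
values = [m["value"] for m in payload["metrics"]]
```

Since the server already serves this history per run, the chart is plausibly a frontend-only change, as the proposal suggests.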

What component(s) does this feature affect?

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/pipelines: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this feature affect?

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

What language(s) does this feature affect?

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

What integration(s) does this feature affect?

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

Issue Analytics

  • State: open
  • Created: a year ago
  • Reactions: 2
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

2 reactions
BenWilson2 commented, Oct 31, 2022

@schmidt-jake I’m glad that I guessed what you were talking about. Thank you for confirming. We’ll take a look at it internally and discuss. Thank you for the great idea!

0 reactions
schmidt-jake commented, Nov 7, 2022

“Each of those epoch writes to the tracking server have a discrete timestamp component to them?”

I believe so, because they are logged as metrics and each metric is logged with its timestamp.
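The shape being discussed can be mimicked with a small stdlib-only sketch (the helper name is hypothetical): each logged entry carries a value, a step, and a millisecond timestamp, which is what would let the chart use either step number or wall-clock time as the X axis.

```python
import time

def log_metric_entry(history, value, step):
    """Append a metric entry with a millisecond timestamp, mirroring the
    (value, timestamp, step) triple a tracking store keeps per data point."""
    history.append({
        "value": value,
        "timestamp": int(time.time() * 1000),  # discrete per-write timestamp
        "step": step,
    })

history = []
for epoch, loss in enumerate([0.9, 0.6, 0.4]):
    log_metric_entry(history, loss, step=epoch)
```

Because every entry gets its own timestamp at write time, per-epoch writes are individually time-stamped, as the comment above confirms.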
