[FR] learning curve visualization
Willingness to contribute
No. I cannot contribute this feature at this time.
Proposal Summary
Add a learning curve visualization to the Run Comparison view. There are a few different definitions of “learning curve”; the one I mean here is a graph with step number on the X axis and a metric of interest on the Y axis, where multiple training jobs can be plotted as lines. Here’s an example from a different tool:
Motivation
What is the use case for this feature?
A data scientist wants to understand the impact of hyperparameter choices on model convergence.
Why is this use case valuable to support for MLflow users in general?
When comparing runs, charting metrics over time for each run is useful for understanding the impact of hyperparameters and informing model selection.
Why is this use case valuable to support for your project(s) or organization?
See above.
Why is it currently difficult to achieve this use case?
MLflow (as of version 1.30.0) does not provide a learning curve visualization.
Details
I think this feature would only require touching the frontend code, since the data structure for metrics tracking already exists.
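To illustrate that the tracked data is already sufficient, here is a minimal sketch of how the chart could be drawn today from the existing Python tracking client; the experiment ID, metric name, and use of matplotlib are illustrative assumptions, not part of the proposal.

```python
import matplotlib.pyplot as plt
from mlflow.tracking import MlflowClient

client = MlflowClient()
metric_key = "train_loss"                          # hypothetical metric name
runs = client.search_runs(experiment_ids=["0"])    # hypothetical experiment ID

plt.figure()
for run in runs:
    # get_metric_history returns one Metric entity per logged point,
    # each carrying `step`, `value`, and `timestamp`.
    history = sorted(
        client.get_metric_history(run.info.run_id, metric_key),
        key=lambda m: m.step,
    )
    label = run.data.tags.get("mlflow.runName", run.info.run_id[:8])
    plt.plot([m.step for m in history], [m.value for m in history], label=label)

plt.xlabel("step")
plt.ylabel(metric_key)
plt.legend()
plt.show()
```

The proposed feature would essentially render this same plot in the Run Comparison UI instead of requiring users to script it.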
What component(s) does this bug affect?
- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/pipelines: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging
What interface(s) does this bug affect?
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support
What language(s) does this bug affect?
- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages
What integration(s) does this bug affect?
- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations
Issue Analytics
- Created a year ago
- Reactions: 2
- Comments: 8 (3 by maintainers)
@schmidt-jake I’m glad that I guessed what you were talking about. Thank you for confirming. We’ll take a look at it internally and discuss. Thank you for the great idea!
I believe so, because they are logged as metrics and each metric is logged with its timestamp.
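For reference, a minimal example of how per-step metrics are logged today (the metric name and values are purely illustrative), which is exactly the history the proposed chart would read:

```python
import mlflow

# Each log_metric call records the value together with a step and a
# timestamp, so the history needed for a learning curve already exists.
with mlflow.start_run():
    for step, loss in enumerate([0.9, 0.5, 0.3, 0.2]):  # illustrative values
        mlflow.log_metric("train_loss", loss, step=step)
```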