[FR] Model registry semantic versioning
Willingness to contribute
Yes. I would be willing to contribute this feature with guidance from the MLflow community.
Proposal Summary
For registered model versions, allow user-specified semantic versioning strings (e.g. v1.1.3) in addition to auto-incrementing integers.
Motivation
What is the use case for this feature?
Providing greater flexibility to users for identifying a registered model
Why is this use case valuable to support for MLflow users in general?
Currently, it’s not possible to indicate how much a model has changed between versions because the version number increment is always the same. Semantic versioning would provide a much richer description of each model, making it easier for engineers to recall what’s what without having to click through to the source run page. Semantic versions could also integrate better with external projects (e.g. by matching the git tag or release version of a GitHub repository).
Why is this use case valuable to support for your project(s) or organization?
Our team could create conventions for the different components of the version number. For example:
- Major: new architecture or changes to input/output schema
- Minor: newly trained weights (different optimizer, additional data, etc.)
- Patch: weight- and architecture-compatible graph optimizations (new framework, fused layers, etc.)
  - This happens frequently when deploying to edge devices because we may find small tweaks that help layers run more efficiently or enable access to accelerator chips
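As a rough illustration of how such a convention could be read back from two version strings, here is a minimal Python sketch. The names `parse_version`, `change_level`, and `CONVENTION` are hypothetical and not part of MLflow; the accepted format (an optional `v` prefix followed by three integers) is an assumption based on the `v1.1.3` example above.

```python
import re

# Assumed format: optional "v" prefix followed by MAJOR.MINOR.PATCH integers.
_SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")

# Meaning of each component under the example team convention above.
CONVENTION = {
    "major": "new architecture or input/output schema change",
    "minor": "newly trained weights",
    "patch": "weight- and architecture-compatible graph optimization",
}


def parse_version(text):
    """Return (major, minor, patch) as integers, or raise ValueError."""
    match = _SEMVER.match(text.strip())
    if match is None:
        raise ValueError(f"not a semantic version: {text!r}")
    return tuple(int(part) for part in match.groups())


def change_level(old, new):
    """Name the most significant component that differs, or None if equal."""
    names = ("major", "minor", "patch")
    for name, a, b in zip(names, parse_version(old), parse_version(new)):
        if a != b:
            return name
    return None
```

With this, `change_level("v1.1.3", "v1.2.0")` names the minor component, which under the convention above would tell an engineer that the weights were retrained without visiting the source run page.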
Why is it currently difficult to achieve this use case?
There’s no option to specify any version number other than the default, auto-incrementing integers
Details
How might this look in the UI?
- Click “Register Model” button from the run page artifacts section
- Select the model name
- A default version number is populated, but the user can change it
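One way the pre-populated default and the user's override might be resolved is sketched below in Python. This is only an assumption about possible behavior, not a description of MLflow: the helper names, the choice of a patch bump as the default, and the `v0.1.0` starting point for a model with no versions are all hypothetical.

```python
import re

# Assumed format: optional "v" prefix followed by MAJOR.MINOR.PATCH integers.
_SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def _parse(text):
    """Return (major, minor, patch) as integers, or raise ValueError."""
    match = _SEMVER.match(text.strip())
    if match is None:
        raise ValueError(f"not a semantic version: {text!r}")
    return tuple(int(part) for part in match.groups())


def default_version(existing):
    """Suggest the value to pre-populate: a patch bump of the highest version."""
    if not existing:
        return "v0.1.0"  # assumption: starting point for a new model name
    # Tuples compare numerically, so v1.10.0 sorts above v1.2.9.
    major, minor, patch = max(_parse(v) for v in existing)
    return f"v{major}.{minor}.{patch + 1}"


def resolve_version(existing, user_input=None):
    """Return the version to register: a valid user override, else the default."""
    if user_input is None or not user_input.strip():
        return default_version(existing)
    candidate = _parse(user_input)
    if candidate in {_parse(v) for v in existing}:
        raise ValueError(f"version already registered: {user_input!r}")
    return "v{}.{}.{}".format(*candidate)
```

Comparing parsed tuples rather than raw strings matters here, since a plain string sort would place `v1.2.9` above `v1.10.0`.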
What component(s) does this bug affect?
- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging
What interface(s) does this bug affect?
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support
What language(s) does this bug affect?
- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages
What integration(s) does this bug affect?
- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations
If I can add a suggestion to this FR:
In my company we are designing automatic training and deployment of fresh models in production. For that we need a versioning pattern that lets us distinguish between different kinds of change to the model.
This distinction is fully consistent with Semantic Versioning. Accordingly, using the major.minor.patch pattern, we will increment the:
- major number when breaking backward compatibility
- minor number when preserving backward compatibility but introducing code changes in the ML Project
- patch number when just retraining the model using the same ML Project code

@BenWilson2 @dbczumar @harupy @WeichenXu123 Please assign a maintainer and start triaging this issue.
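For the automated pipeline described in the suggestion above (major for breaking changes, minor for backward-compatible code changes, patch for a plain retrain), the bump decision could be sketched in Python as follows. The `Change` enum and `next_version` function are hypothetical names, and the sketch assumes each rule means incrementing the named component; resetting the lower components to zero follows the Semantic Versioning specification.

```python
from enum import Enum


class Change(Enum):
    """Kinds of change from the suggestion above (names are hypothetical)."""

    BREAKING = "breaks backward compatibility"
    CODE = "backward-compatible code change in the ML Project"
    RETRAIN = "retrain with the same ML Project code"


def next_version(current, change):
    """Return the version that follows `current` for the given kind of change."""
    major, minor, patch = (int(part) for part in current.split("."))
    if change is Change.BREAKING:
        return f"{major + 1}.0.0"  # lower components reset to zero
    if change is Change.CODE:
        return f"{major}.{minor + 1}.0"
    if change is Change.RETRAIN:
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change!r}")
```

A scheduled retraining job would then call `next_version(latest, Change.RETRAIN)`, while a pipeline triggered by a code merge would pass `Change.CODE` or `Change.BREAKING` depending on whether compatibility is preserved.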