[FR] Model registry semantic versioning

Willingness to contribute

Yes. I would be willing to contribute this feature with guidance from the MLflow community.

Proposal Summary

For registered model versions, allow user-specified semantic version strings (e.g. v1.1.3) in addition to auto-incrementing integers.

Motivation

What is the use case for this feature?

Providing greater flexibility to users for identifying a registered model

Why is this use case valuable to support for MLflow users in general?

Currently, it’s not possible to indicate how much a model has changed between versions because the version number increment is always the same. Semantic versioning would provide a much richer description of each model, making it easier for engineers to recall what’s what without having to click through to the source run page. Semantic versions could also integrate better with external projects (e.g. matching the git tag or release version of a GitHub repository).

Why is this use case valuable to support for your project(s) or organization?

Our team could create conventions for the different components of the version number. For example:

Major: new architecture or changes to input/output schema
Minor: newly trained weights (different optimizer, additional data, etc)
Patch: weight- and architecture-compatible graph optimizations (new framework, fused layers, etc)
- This happens frequently when deploying to edge devices because we may find small tweaks that help layers run more efficiently or enable access to accelerator chips
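As a rough illustration of why structured components matter, the sketch below parses major.minor.patch strings and orders them numerically. This is not MLflow code: `ModelVersion` and `parse_version` are hypothetical names, and the optional leading "v" is an assumption based on the v1.1.3 example in the summary.

```python
import re
from typing import NamedTuple

# Assumed format: optional "v" prefix followed by major.minor.patch integers.
_SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


class ModelVersion(NamedTuple):
    """Hypothetical container for the three version components."""
    major: int  # new architecture or input/output schema change
    minor: int  # newly trained weights
    patch: int  # weight- and architecture-compatible graph optimization


def parse_version(text: str) -> ModelVersion:
    """Parse a string such as 'v1.1.3' into its numeric components."""
    match = _SEMVER.match(text.strip())
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {text!r}")
    return ModelVersion(*(int(part) for part in match.groups()))


# Numeric ordering differs from plain string ordering (1.10.0 comes after 1.9.12).
versions = ["v1.9.3", "v1.10.0", "v2.0.0", "v1.9.12"]
ordered = sorted(versions, key=parse_version)
```

Comparing parsed tuples rather than raw strings keeps v1.10.0 after v1.9.12, which a lexicographic sort would get wrong.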

Why is it currently difficult to achieve this use case?

There’s no option to specify any version number other than the default, auto-incrementing integers

Details

How might this look in the UI?

Click “Register Model” button from the run page artifacts section
Select the model name
A default version number is populated, but the user can change it
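One possible way to back the pre-filled field is sketched below. It assumes the default is a patch bump of the highest existing version and that duplicates are rejected; neither behavior is specified in this request, and `suggest_default` and `resolve_version` are hypothetical helpers, not MLflow APIs.

```python
import re

# Assumed format: optional "v" prefix followed by major.minor.patch integers.
_PATTERN = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def _parse(text):
    """Return (major, minor, patch) or raise ValueError for malformed input."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {text!r}")
    return tuple(int(part) for part in match.groups())


def suggest_default(existing):
    """Value to pre-fill in the form: patch bump of the highest known version."""
    if not existing:
        return "v0.1.0"  # assumed starting point for a brand-new model
    major, minor, patch = max(_parse(v) for v in existing)
    return f"v{major}.{minor}.{patch + 1}"


def resolve_version(existing, user_input=None):
    """Use the user's override if given, otherwise fall back to the default."""
    if user_input is None or not user_input.strip():
        return suggest_default(existing)
    chosen = _parse(user_input)
    if chosen in {_parse(v) for v in existing}:
        raise ValueError(f"version already registered: {user_input!r}")
    return "v{}.{}.{}".format(*chosen)
```

For example, with existing versions v1.0.0, v1.2.9 and v1.10.0 the form would show v1.10.1, and the user could still type 2.0.0 to register v2.0.0 instead.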

What component(s) does this bug affect?

area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
area/projects: MLproject format, project running backends
area/scoring: MLflow Model server, model deployment tools, Spark UDFs
area/server-infra: MLflow Tracking server backend
area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

What language(s) does this bug affect?

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

What integration(s) does this bug affect?

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations

Issue Analytics

State:
Created 10 months ago
Reactions: 4
Comments: 5

Top GitHub Comments

2 reactions

cristofp commented, Nov 28, 2022

If I may add a suggestion to this FR:

At my company we are designing automatic training and deployment of fresh models in production. For this we need a versioning pattern that distinguishes the following kinds of change in a model:

a) the new version of the model breaks backward compatibility (the new signature of the inference method is not backward compatible)
b) the new version of the model is backward compatible, but there was some code change in the ML Project
c) the new version of the model is produced using the same ML Project code, but with a different (newer) training data set

This distinction is fully consistent with Semantic Versioning. Accordingly, using the major.minor.patch pattern, we would:

a) bump the major number when breaking backward compatibility
b) bump the minor number when preserving backward compatibility but introducing code changes in the ML Project
c) bump the patch number when just retraining the model using the same ML Project code
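The three rules above could be automated in a training pipeline roughly as follows. This is only a sketch: the `Change` enum and `bump` function are hypothetical names, and resetting the lower components to zero follows the usual Semantic Versioning convention rather than anything stated in this comment.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical labels for the three kinds of change listed above."""
    BREAKING = "breaking"  # (a) inference signature no longer compatible
    CODE = "code"          # (b) compatible, but ML Project code changed
    RETRAIN = "retrain"    # (c) same code, newer training data


def bump(version, change):
    """Return the next (major, minor, patch) tuple for the given change."""
    major, minor, patch = version
    if change is Change.BREAKING:
        return (major + 1, 0, 0)
    if change is Change.CODE:
        return (major, minor + 1, 0)
    if change is Change.RETRAIN:
        return (major, minor, patch + 1)
    raise ValueError(f"unknown change type: {change!r}")
```

An automated retraining job would then always call `bump(current, Change.RETRAIN)`, while a release that alters the model signature would pass `Change.BREAKING`.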

2 reactions

mlflow-automation commented, Nov 16, 2022

@BenWilson2 @dbczumar @harupy @WeichenXu123 Please assign a maintainer and start triaging this issue.
