[FR] Add environment restoration to `mlflow.pyfunc.load_model` via new `env_manager` parameter
Willingness to contribute
Yes. I can contribute this feature independently.
Proposal Summary
Currently, `mlflow.pyfunc.spark_udf` accepts an `env_manager` parameter that enables users to load and score models in an isolated environment with their training-time dependencies restored. We would like to introduce this parameter to `mlflow.pyfunc.load_model` as well, improving model reproducibility for this popular way of loading models.
Motivation
What is the use case for this feature?
This feature will make model reproduction simpler. It will be easier to eliminate missing or version-mismatched dependencies for models loaded with `mlflow.pyfunc.load_model`. It would also improve the accuracy and usability of features like model evaluation, which rely on `mlflow.pyfunc.load_model`.
Why is this use case valuable to support for MLflow users in general?
MLflow users care about model reproducibility.
Why is it currently difficult to achieve this use case?
There are currently two options for reproducing a model:
- Manually download the model requirements from the model artifacts and install them in the Python script or notebook environment. This makes it difficult to load models in scripts that require different dependencies than the model itself.
- Call `mlflow.pyfunc.spark_udf`. This requires setting up Spark (a large dependency that is complicated to set up in local environments) and converting datasets to Spark DataFrames (inconvenient).
Details
Interface
Function signature:
`mlflow.pyfunc.load_model(model_uri: str, suppress_warnings: bool = False, dst_path: Optional[str] = None, env_manager: str = 'local')`
- When `env_manager='local'`, the current behavior of `load_model` is preserved: no environment restoration is performed, and warnings are printed if there is a mismatch between the model’s dependencies and the libraries installed in the Python script / notebook environment.
- When a model is loaded with `env_manager='virtualenv'` or `env_manager='conda'` and `predict` is called, we follow `spark_udf`’s approach of preparing a separate Python environment (using virtualenv or conda) with the model dependencies installed, starting an MLflow Model Server with the loaded model in that environment, and scoring the input data against it (see the usage sketch after this list).
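For illustration, here is how the proposed parameter might be used. The model URI and input data below are hypothetical, and the `env_manager='virtualenv'` restoration behavior exists only under this proposal:

```python
import mlflow
import pandas as pd

# Proposed usage sketch: 'virtualenv' would restore the model's
# training-time dependencies in an isolated environment before scoring.
model = mlflow.pyfunc.load_model(
    "models:/my_model/1",  # hypothetical model URI
    env_manager="virtualenv",
)

# Under the proposal, predict() routes the input through an MLflow Model
# Server running in the restored environment instead of scoring in-process.
predictions = model.predict(pd.DataFrame({"x": [1.0, 2.0]}))
```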
Implementation
Thanks to the `mlflow.pyfunc.spark_udf` implementation, there already exists functionality to:
- prepare an isolated environment with the training-time model dependencies installed
- start a model server in that environment with the desired model
- score datasets against the server

We can simply re-use this functionality in `mlflow.pyfunc.load_model`, as sketched below.
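A minimal sketch of how that reuse could look inside `load_model`. The helpers referenced here (`_prepare_env`, `_start_scoring_server`, `_score_via_server`, `_load_model_in_process`) are hypothetical placeholders standing in for the existing `spark_udf` machinery, not actual MLflow internals:

```python
from typing import Any, Optional


class _ServedPyFuncModel:
    """Hypothetical wrapper that routes predict() through an MLflow Model
    Server running in a restored environment, mirroring spark_udf."""

    def __init__(self, model_uri: str, env_manager: str):
        # Placeholders for the existing spark_udf machinery listed above.
        self._env = _prepare_env(model_uri, env_manager)  # restore deps
        self._server = _start_scoring_server(model_uri, self._env)

    def predict(self, data: Any) -> Any:
        # Score against the local model server instead of in-process.
        return _score_via_server(self._server, data)


def load_model(
    model_uri: str,
    suppress_warnings: bool = False,
    dst_path: Optional[str] = None,
    env_manager: str = "local",
):
    if env_manager == "local":
        # Current behavior: load the model and score in-process.
        return _load_model_in_process(model_uri, suppress_warnings, dst_path)
    # Proposed behavior ('virtualenv' / 'conda'): defer scoring to a model
    # server started in an environment with the model's dependencies.
    return _ServedPyFuncModel(model_uri, env_manager)
```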
Complexities
When environment restoration for spark UDFs were originally implemented, we did not add the feature to load_model
because its predict
function accepts a wider set of input types, which are harder to use in the model server approach. We should look into how to build support for all possible predict
function inputs.
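For context on why this is harder than the `spark_udf` case (which only handles columnar Spark data), a short sketch of the input-type breadth; the encoding shown is illustrative, not a proposed wire format:

```python
import numpy as np
import pandas as pd

# In-process, pyfunc predict() accepts a range of rich input types directly:
df_input = pd.DataFrame({"x": [1.0, 2.0]})
array_input = np.array([[1.0, 2.0], [3.0, 4.0]])
dict_input = {"x": np.array([1.0, 2.0])}  # e.g. named tensor inputs

# A model server only sees a serialized request body, so each input type
# above needs a faithful round-trip encoding; one possibility for frames:
payload = df_input.to_json(orient="split")
```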
What component(s) does this bug affect?
- `area/artifacts`: Artifact stores and artifact logging
- `area/build`: Build and test infrastructure for MLflow
- `area/docs`: MLflow documentation pages
- `area/examples`: Example code
- `area/model-registry`: Model Registry service, APIs, and the fluent client calls for Model Registry
- `area/models`: MLmodel format, model serialization/deserialization, flavors
- `area/pipelines`: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
- `area/projects`: MLproject format, project running backends
- `area/scoring`: MLflow Model server, model deployment tools, Spark UDFs
- `area/server-infra`: MLflow Tracking server backend
- `area/tracking`: Tracking Service, tracking client APIs, autologging
What interface(s) does this bug affect?
- `area/uiux`: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- `area/docker`: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
- `area/sqlalchemy`: Use of SQLAlchemy in the Tracking Service or Model Registry
- `area/windows`: Windows support
What language(s) does this bug affect?
- `language/r`: R APIs and clients
- `language/java`: Java APIs and clients
- `language/new`: Proposals for new client languages
What integration(s) does this bug affect?
- `integrations/azure`: Azure and Azure ML integrations
- `integrations/sagemaker`: SageMaker integrations
- `integrations/databricks`: Databricks integrations
Regarding (3), I don’t think there’s a cost savings by excluding `conda`; it’s just a matter of which environment manager string is passed to the MLflow Model Server startup script 😃

Update: after discussion, we’ve decided to deprioritize adding env restoration to `mlflow.pyfunc.load_model` and will instead work on building env restoration directly into the `mlflow.evaluate` API. To build env restoration into `load_model`, we would need to figure out how to kill model servers started by `load_model` on the death or exit of the parent Python process on all platforms, which would take work beyond our available capacity at this time. In `mlflow.evaluate`, we can simply kill the model server once model prediction is complete.
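To make the lifetime difference concrete, a minimal sketch; `_start_scoring_server`, `_score_via_server`, and `_ServedPyFuncModel` are hypothetical placeholders for the scoring-server machinery, not MLflow internals:

```python
import atexit
import subprocess


def evaluate_with_restored_env(model_uri, data):
    server: subprocess.Popen = _start_scoring_server(model_uri)
    try:
        return _score_via_server(server, data)
    finally:
        # In mlflow.evaluate the server's lifetime is scoped to one call,
        # so it can be torn down deterministically after prediction.
        server.terminate()
        server.wait()


def load_model_with_restored_env(model_uri):
    server: subprocess.Popen = _start_scoring_server(model_uri)
    # For load_model the server must outlive this call. atexit only fires
    # on clean interpreter exits, not when the parent is killed outright,
    # which is the cross-platform cleanup gap described above.
    atexit.register(server.terminate)
    return _ServedPyFuncModel(server)
```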