[FR] Improve performance by reducing the number of calls needed to retrieve a model
Thank you for submitting a feature request. Before proceeding, please review MLflow's Issue Policy for feature requests and the MLflow Contributing Guide.
Please fill in this feature request template to ensure a timely and thorough response.
Willingness to contribute
The MLflow Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature (either as an MLflow Plugin or an enhancement to the MLflow code base)?
- Yes. I can contribute this feature independently.
- Yes. I would be willing to contribute this feature with guidance from the MLflow community.
- No. I cannot contribute this feature at this time.
Proposal Summary
Retrieve models more efficiently by reducing the number of required requests.
Currently, retrieving a model requires 3 requests:

```python
import os

import mlflow
import mlflow.sklearn

experiment_name = "energy_forecast_10001_Amsterdam"
experiment = mlflow.get_experiment_by_name(experiment_name)
run = mlflow.search_runs(experiment.experiment_id, max_results=1)
model = mlflow.sklearn.load_model(os.path.join(run.artifact_uri[0], "model/"))
```
It would be nice if this could be sped up by getting the model in only 1 request:

```python
model = mlflow.sklearn.load_latest_model(experiment_name)
```

or 2 requests:

```python
run = mlflow.search_runs(experiment_name, max_results=1)
model = mlflow.sklearn.load_model(os.path.join(run.artifact_uri[0], "model/"))
```
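Until such an API exists, a small wrapper can at least provide the one-call ergonomics on top of the current APIs. A minimal sketch, where `load_latest_model` is the proposed (hypothetical) name and the helper still performs three requests under the hood:

```python
import os

import mlflow
import mlflow.sklearn


def load_latest_model(experiment_name: str):
    """Hypothetical helper: load the model of the most recent run in an experiment.

    Hides the boilerplate, but still issues three requests to the tracking server.
    """
    experiment = mlflow.get_experiment_by_name(experiment_name)
    if experiment is None:
        raise ValueError(f"No experiment named {experiment_name!r}")
    runs = mlflow.search_runs([experiment.experiment_id], max_results=1)
    if runs.empty:
        raise ValueError(f"No runs found in experiment {experiment_name!r}")
    return mlflow.sklearn.load_model(os.path.join(runs.artifact_uri[0], "model"))


model = load_latest_model("energy_forecast_10001_Amsterdam")
```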
Motivation
- What is the use case for this feature? Performance: loading a model currently requires several round trips to the tracking server.
- Why is this use case valuable to support for MLflow users in general? Faster model loading benefits all users who retrieve models programmatically.
- Why is this use case valuable to support for your project(s) or organization? Performance.
- Why is it currently difficult to achieve this use case? (please be as specific as possible about why related MLflow features and components are insufficient) It is difficult, or impossible, to improve performance at a higher level when the lower-level calls are not performant.
What component(s), interfaces, languages, and integrations does this feature affect?
Components

- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging

Interfaces

- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support

Languages

- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages

Integrations

- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations
Details
(Use this section to include any additional information about the feature. If you have a proposal for how to implement this feature, please include it here. For implementation guidelines, please refer to the Contributing Guide.)
I’ll give this a try
This issue was resolved by this PR (https://github.com/mlflow/mlflow/pull/5564) and the MLflow 1.25.0 release. I did a small test on mlflow==1.25.0 with a SQLite database. Performance did improve! It varied quite a bit compared to before, probably due to the environment (local vs. Kubernetes cluster, and file-based vs. SQLite) and also how many runs/models were stored.
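For reference, a minimal sketch of the two-request pattern this enables, assuming the `experiment_names` argument that `mlflow.search_runs` gained around that release:

```python
import os

import mlflow
import mlflow.sklearn

# One call to find the latest run by experiment name, one to fetch the model.
runs = mlflow.search_runs(
    experiment_names=["energy_forecast_10001_Amsterdam"], max_results=1
)
model = mlflow.sklearn.load_model(os.path.join(runs.artifact_uri[0], "model"))
```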
Summary of the performance check for model retrieval, per code chunk:

| Retrieval method | Average over 10 calls |
| --- | --- |
| Tracking registry: model via name + experiment + run | 1.48 s |
| Tracking registry: model via name + run | 1.46 s |
| Model registry: model via version + model registry | 1.62 s |
| Model registry: multiple models via stage None + model registry | 1.46 s |
| Model registry: single model via stage Production + model registry | 1.49 s |
Tracking registry model retrieval
Retrieve model via name + experiment + run (1.48 s ± 44.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each))
Retrieve model via name + run (1.46 s ± 28.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each))
Model registry model retrieval
Retrieve model via version + model registry (1.62 s ± 171 ms per loop (mean ± std. dev. of 7 runs, 10 loops each))
Retrieve model via stage None + model registry (1.46 s ± 43.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)). Ten models are in stage None, but the most recently trained model will be retrieved.
Retrieve model via stage Production + model registry (1.49 s ± 66.9 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)). A single model is in Production.
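The model-registry variants above load models through `models:/` URIs. A minimal sketch, assuming a registered model named `energy_forecast` (the name is illustrative):

```python
import mlflow.sklearn

# Load a specific registered model version.
model_v3 = mlflow.sklearn.load_model("models:/energy_forecast/3")

# Load the latest version in the Production stage.
model_prod = mlflow.sklearn.load_model("models:/energy_forecast/Production")
```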
Thanks restless for implementing this. It's much neater to be able to get a run based on experiment_name directly from the tracking registry 😃