
Create a MlflowModelDataSet


The current version of kedro-mlflow can only log a full pipeline as an MLflow model, through the KedroPipelineModel class. It would be useful to create an AbstractDataSet subclass so that individual models can be logged from within the `catalog.yml`.

Issue Analytics

Created 3 years ago
Comments: 8 (7 by maintainers)

Top GitHub Comments

1 reaction

akruszewski commented, Sep 14, 2020

I totally agree with @Galileo-Galilei and your opinion @kaemo that the first option is a better solution. As @Galileo-Galilei mentions, it would be better to pass constructor arguments in a separate argument (init_args sounds reasonable).

1 reaction

Galileo-Galilei commented, Sep 9, 2020

This needs some deeper thought to see if something better can come out of it, but the second solution is a no-go for me (even if I understand that it should work):

- Doing I/O operations (automagically, which is even worse) in kedro nodes completely breaks kedro's principles, and keeping to those principles is one of the key ideas of using this plugin.
- It will make debugging very difficult.
- It is much more likely to be impacted by future kedro changes and to break some of the kedro features (namespaces, layers…).

On the other hand, the first solution seems reasonable (after all, it is what kedro itself does to instantiate the dataset). If arguments are needed, a custom init_args (or whatever the name is) should be used rather than passing the full string (my_package.mlflow_models.MyPyfuncModel(arg1="values") is not only ugly but breaks parameter versioning). I’ll try to dive deeper to see advantages and shortcomings.
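To make the first option concrete, here is a minimal sketch of resolving a class from a dotted path and passing its constructor arguments separately. It is not the kedro-mlflow implementation: the helper `instantiate_from_path` and the entry keys `pyfunc_class` and `init_args` are hypothetical names, and `datetime.timedelta` stands in for a user class such as `my_package.mlflow_models.MyPyfuncModel` so that the example runs with the standard library only.

```python
import importlib
from typing import Any, Dict, Optional


def instantiate_from_path(class_path: str, init_args: Optional[Dict[str, Any]] = None) -> Any:
    """Import the class named by a dotted path and build it with keyword arguments.

    Hypothetical helper for illustration; not part of kedro or kedro-mlflow.
    """
    module_name, _, class_name = class_path.rpartition(".")
    if not module_name:
        raise ValueError(
            f"Expected a dotted path like 'package.module.Class', got {class_path!r}"
        )
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    # Constructor arguments stay a plain mapping, separate from the class path.
    return cls(**(init_args or {}))


# Hypothetical catalog entry, written as the dict a YAML loader would produce.
# "pyfunc_class" and "init_args" are assumed key names, not a documented schema.
entry = {
    "pyfunc_class": "datetime.timedelta",  # stand-in for a user-defined model class
    "init_args": {"hours": 2},
}

model = instantiate_from_path(entry["pyfunc_class"], entry["init_args"])
print(model.total_seconds())
```

Keeping `init_args` as its own mapping means each argument remains an ordinary configuration value instead of being embedded in a string such as `MyPyfuncModel(arg1="values")`, which is the concern raised above about parameter versioning.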
