Create a MlflowModelDataSet

The current version of kedro-mlflow can only log a full pipeline as an MLflow model, through the KedroPipelineModel class. It would be useful to create an AbstractDataSet class so that models can be logged from within the `catalog.yml`.
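As a rough illustration of the request, the sketch below shows the shape such a dataset might take. It is not the implementation that was eventually merged: the class name `ModelDataSetSketch` and its `flavor` and `filepath` parameters are hypothetical, and the class deliberately does not subclass kedro's `AbstractDataSet` so that it runs without kedro installed. It only assumes that the flavor module exposes `save_model(model, path)` and `load_model(path)`, as MLflow's built-in flavors such as `mlflow.sklearn` do.

```python
import importlib
from pathlib import Path
from typing import Any, Dict


class ModelDataSetSketch:
    """Hypothetical dataset mirroring the _load/_save/_describe hooks
    that a kedro AbstractDataSet subclass would implement."""

    def __init__(self, flavor: str, filepath: str) -> None:
        # `flavor` is the dotted name of a module exposing save_model and
        # load_model (for example "mlflow.sklearn"); it is resolved at
        # construction time so that a bad name fails early.
        self._flavor_name = flavor
        self._flavor = importlib.import_module(flavor)
        self._filepath = Path(filepath)

    def _load(self) -> Any:
        # Delegate deserialization to the flavor module.
        return self._flavor.load_model(str(self._filepath))

    def _save(self, model: Any) -> None:
        # Delegate serialization to the flavor module.
        self._flavor.save_model(model, str(self._filepath))

    def _describe(self) -> Dict[str, Any]:
        # Only plain values, so the description stays printable.
        return {"flavor": self._flavor_name, "filepath": str(self._filepath)}
```

With a class of this shape, a catalog entry would only need to name the dataset type, the flavor module, and a path, and all model I/O would stay in the catalog rather than in nodes.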

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

1 reaction
akruszewski commented, Sep 14, 2020

I totally agree with @Galileo-Galilei and with you, @kaemo, that the first option is the better solution. As @Galileo-Galilei mentions, it would be better to pass constructor arguments in a separate argument (`init_args` sounds reasonable).

1 reaction
Galileo-Galilei commented, Sep 9, 2020

It needs some deeper thought to see if something better can come out, but the second solution is a no-go for me (even if I understand that it should work):

  • Doing I/O operations in kedro nodes (automagically, which is even worse) completely breaks kedro's principles, and preserving those principles is one of the key ideas behind this plugin.
  • It will make debugging very difficult.
  • It is much more likely to be impacted by future kedro changes and to break some of the kedro features (namespaces, layers…).

On the other hand, the first solution seems reasonable (after all, it is what kedro itself does to instantiate the dataset). If arguments are needed, a custom `init_args` (or whatever the name is) should be used rather than passing the full string (`my_package.mlflow_models.MyPyfuncModel(arg1="values")` is not only ugly but breaks parameter versioning). I'll try to dive deeper to see advantages and shortcomings.
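A minimal sketch of that first approach follows: the class is resolved from a plain dotted path, and constructor arguments travel in a separate mapping instead of being embedded in the string. The helper name `instantiate_from_path` is hypothetical, and this is not kedro's or kedro-mlflow's actual loading code, only the general technique using the standard library.

```python
import importlib
from typing import Any, Dict, Optional


def instantiate_from_path(
    class_path: str, init_args: Optional[Dict[str, Any]] = None
) -> Any:
    """Import the class named by a dotted path and instantiate it.

    `init_args` holds the constructor keyword arguments, so the path
    string never has to contain a call expression.
    """
    module_name, _, class_name = class_path.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'package.module.ClassName', got {class_path!r}")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    return cls(**(init_args or {}))


# Usage with a standard-library class standing in for a custom model class.
third = instantiate_from_path(
    "fractions.Fraction", {"numerator": 1, "denominator": 3}
)
```

Because the arguments live in their own mapping, they can sit in the catalog as ordinary keys next to the class path, which keeps them visible to whatever tracks parameters.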
