Cannot find the saved native string model though _SUCCESS files are created

Describe the bug

I trained an mmlspark.lightgbm regressor that works great and generates outputs on the test set. When I save it to a local directory on an Azure/k8s pod using the following command:

reg_fitted.saveNativeModel('./savemodel/model02/', overwrite=True)

the txt file containing the model parameters is not created. The directory contains the _SUCCESS file, but not the txt file with the model itself:

ls savemodel/model02/ -la

total 12
drwxr-xr-x 2 root root 4096 Jan 6 22:37 .
drwxr-xr-x 4 root root 4096 Jan 6 22:37 ..
-rw-r--r-- 1 root root 8 Jan 6 22:37 ._SUCCESS.crc
-rw-r--r-- 1 root root 0 Jan 6 22:37 _SUCCESS
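The listing above shows only Hadoop marker and checksum files. As a quick diagnostic, a minimal sketch like the following can confirm whether any real output file exists in the directory on the node where it runs. The helper name `list_model_files` is hypothetical and not part of MMLSpark; it only inspects the local filesystem and does not explain why the model file is missing.

```python
from pathlib import Path
from typing import List


def list_model_files(directory: str) -> List[str]:
    """Return names of regular files that are not _SUCCESS markers or .crc checksums."""
    names = []
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file():
            continue
        # Skip the job-completion marker and Hadoop checksum sidecar files.
        if entry.name == "_SUCCESS" or entry.name.endswith(".crc"):
            continue
        names.append(entry.name)
    return names
```

An empty result for ./savemodel/model02/ would match the symptom described here: the write reported success, but no model file landed in that local directory.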

To Reproduce

I am not sure how to reproduce this exactly, since no error is raised, yet nothing is saved.

Expected behavior

I expected to see a txt file in the same directory that describes the model. I have done this successfully before on an Azure/Databricks cluster, but it behaves differently on an Azure/k8s deployment.

Info (please complete the following information):

MMLSpark Version: [e.g. v0.18.1]
Spark Version [e.g. 2.4.4]
Spark Platform [e.g. Azure/k8s]

AB#1984591

Issue Analytics

State:
Created 3 years ago
Comments:6

Top GitHub Comments

OscarDPan commented, Jan 13, 2021

FYI this is what I ended up doing:

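        from pyspark import SparkContext
        from pyspark.ml.common import _java2py, _py2java
        from pyspark.ml.wrapper import JavaWrapper
        # Import path may differ between MMLSpark versions.
        from mmlspark.lightgbm import LightGBMClassificationModel

        def call_java(model, name, *args):
            # Invoke a method on the model's underlying JVM object via Py4J.
            m = getattr(model._java_obj, name)
            sc = SparkContext._active_spark_context
            java_args = [_py2java(sc, arg) for arg in args]
            return _java2py(sc, m(*java_args))

        def get_booster(model):
            # Return the JVM-side booster object held by the fitted model.
            return call_java(model, "getModel")

        def get_dump(lightgbm_model: LightGBMClassificationModel):
            # Wrap the booster and read its "model" member (the native model string).
            jxgb = JavaWrapper(get_booster(lightgbm_model))
            return jxgb._call_java("model")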

I think the output of get_dump() on the driver is exactly what I need. I'm just not sure whether there's a better way.

OscarDPan commented, Jan 13, 2021

Let me put it this way: I want to train a LightGBM model with mmlspark/pyspark on a Spark cluster, then extract the native model and save it locally (as with lgb.Booster.save_model) for future use.

How should I approach this?
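One possible approach, offered as a sketch under assumptions rather than an answer confirmed in this thread: if the native model string is already available on the driver (for example from a helper like get_dump() above), it can be written to a local file with plain Python and later loaded with the native lightgbm package. The function names `save_model_string` and `load_native_booster` are hypothetical, and the second one assumes lightgbm is installed locally.

```python
from pathlib import Path


def save_model_string(model_str: str, path: str) -> Path:
    """Write a LightGBM text-format model string to a local file."""
    target = Path(path)
    # Create the parent directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(model_str, encoding="utf-8")
    return target


def load_native_booster(path: str):
    """Load the saved text model with the native lightgbm package (assumed installed)."""
    import lightgbm as lgb  # imported lazily so the writer works without lightgbm
    return lgb.Booster(model_file=path)
```

Usage would look like save_model_string(get_dump(model), "./savemodel/model02/model.txt") on the driver, followed by load_native_booster("./savemodel/model02/model.txt") wherever the model is needed. Writing from the driver process avoids depending on where a distributed writer places its output files.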
