Cannot find the saved native string model though _SUCCESS files are created


Describe the bug

I trained an mmlspark.lightgbm regressor that works well and generates predictions on the test set. When I save it to a local directory on an Azure/k8s pod using the following command:

reg_fitted.saveNativeModel('./savemodel/model02/', overwrite=True)

I do not get the txt file containing the model parameters. I can see the content of that directory which has the _SUCCESS file, but cannot find the txt file that contains the model itself:

ls savemodel/model02/ -la

total 12
drwxr-xr-x 2 root root 4096 Jan 6 22:37 .
drwxr-xr-x 4 root root 4096 Jan 6 22:37 ..
-rw-r--r-- 1 root root 8 Jan 6 22:37 ._SUCCESS.crc
-rw-r--r-- 1 root root 0 Jan 6 22:37 _SUCCESS

To Reproduce

I am not sure exactly how to reproduce this issue: no error is raised, but nothing is saved either.

Expected behavior

I expected to see a txt file in the same directory that describes the model. I have done this successfully on an Azure/Databricks cluster, but it behaves differently on an Azure/k8s deployment.
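The check described above can also be done programmatically. The sketch below is a hypothetical helper (`summarize_spark_output` is not part of MMLSpark or Spark) that separates the `_SUCCESS` marker from actual data files. It assumes the model text would be written as a Spark-style `part-*` file, which is the usual layout for Spark writers but is not confirmed in this issue. One possibility, also unconfirmed here, is that on a multi-node deployment the data file lands on an executor's local filesystem rather than in the directory being inspected.

```python
from pathlib import Path


def summarize_spark_output(directory: str) -> dict:
    """Split the files in a Spark output directory into marker and data files.

    Hypothetical helper for illustration; assumes data files follow the
    conventional Spark "part-*" naming.
    """
    names = sorted(p.name for p in Path(directory).iterdir() if p.is_file())
    # _SUCCESS and *.crc files are bookkeeping and carry no model content.
    part_files = [n for n in names if n.startswith("part-")]
    return {
        "has_success_marker": "_SUCCESS" in names,
        "part_files": part_files,
    }
```

For a directory that holds only `_SUCCESS` and `._SUCCESS.crc`, as in the listing above, this returns `{"has_success_marker": True, "part_files": []}`, which confirms that the write reported success without leaving a data file in that location.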

Info (please complete the following information):

  • MMLSpark Version: [e.g. v0.18.1]
  • Spark Version [e.g. 2.4.4]
  • Spark Platform [e.g. Azure/k8s]

AB#1984591

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
OscarDPan commented, Jan 13, 2021

FYI this is what I ended up doing:

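        from pyspark import SparkContext
        from pyspark.ml.common import _java2py, _py2java
        from pyspark.ml.wrapper import JavaWrapper
        # Import path assumed for MMLSpark; adjust to the installed version.
        from mmlspark.lightgbm import LightGBMClassificationModel

        def call_java(model, name, *args):
            """Call a method on the model's underlying JVM object."""
            method = getattr(model._java_obj, name)
            sc = SparkContext._active_spark_context
            java_args = [_py2java(sc, arg) for arg in args]
            return _java2py(sc, method(*java_args))

        def get_booster(model):
            """Return the JVM-side LightGBM booster held by the fitted model."""
            return call_java(model, "getModel")

        def get_dump(lightgbm_model: LightGBMClassificationModel):
            """Return the booster's native model string to the Python driver."""
            booster = JavaWrapper(get_booster(lightgbm_model))
            return booster._call_java("model")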

I think the output of get_dump() on the driver is exactly what I need. I'm just not sure whether there's a better way.

0 reactions
OscarDPan commented, Jan 13, 2021

Maybe let me put it this way. I want to train a LightGBM model with mmlspark/pyspark on a Spark cluster, then extract the native model and save it locally (as lgb.Booster.save_model would) for future use.

How should I approach this?
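One way to approach this, sketched under assumptions rather than offered as a confirmed answer from the maintainers: once the native model string is available on the driver (for example from get_dump() in the comment above), it can be written with ordinary Python file I/O instead of a Spark writer, and later loaded by the standalone lightgbm package. The helper names `write_model_string` and `load_native_booster` are hypothetical; `lgb.Booster(model_file=...)` is the standard lightgbm constructor for text models.

```python
from pathlib import Path


def write_model_string(model_str: str, path: str) -> Path:
    """Write a LightGBM native model string to a local text file.

    Hypothetical helper: runs on the driver, so the file ends up on the
    driver's filesystem rather than on an executor.
    """
    if not model_str:
        raise ValueError("model string is empty")
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(model_str, encoding="utf-8")
    return target


def load_native_booster(path: str):
    """Load the saved text model with the standalone lightgbm package."""
    import lightgbm as lgb  # imported lazily; only needed when loading

    return lgb.Booster(model_file=str(path))
```

A call such as `write_model_string(get_dump(model), "./savemodel/model02/model.txt")` would then produce a single text file; the file name `model.txt` is only an example. Whether the string returned by get_dump() is accepted unchanged by a given lightgbm version is an assumption worth verifying on a small model first.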
