custom objective function for pyspark

Hi, I see that a custom objective function for the Scala API was recently added in this PR, which is really exciting! Is there any idea when this functionality will be added to pyspark (perhaps it already has been and I haven’t found the PR yet)?

I’m very interested in implementing a custom objective function for the LightGBMRanker model using mean average precision (trying to follow the approach in this paper), which is suited to binary relevance; the current ‘lambdarank’ objective uses NDCG, which is best suited to graded relevance. It would be nice to have this feature, as the xgboost Python package offers the rank:map objective in addition to the default rank:ndcg.
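For readers unfamiliar with the metric mentioned above, here is a minimal sketch of mean average precision over binary relevance labels. It is plain Python, independent of LightGBM, SynapseML, or xgboost, and the function names `average_precision` and `mean_average_precision` are hypothetical. It computes the evaluation metric only; a custom training objective would still need a differentiable surrogate that returns gradients and Hessians, which this sketch does not attempt.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """Average precision for one query.

    `relevance` holds binary labels (1 = relevant, 0 = not relevant)
    in the order the model ranked the items, best first.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # A query with no relevant items contributes 0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries: Sequence[Sequence[int]]) -> float:
    """Mean of per-query average precision; 0.0 for an empty input."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)
```

For example, `average_precision([1, 0, 1])` averages the precision at rank 1 (1/1) and at rank 3 (2/3). Because only the positions of relevant items matter, the metric needs no graded labels, which is why it fits binary training data.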

Thanks so much! We’ve been using your model at our company for the past year, but our training data is binary, not graded, and I’d love to use something better suited to our data!

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

2 reactions
yukihiro123 commented, Dec 12, 2022

Is there already a way to use a custom objective function in pyspark? When I passed a Python custom objective function to the fobj argument of LightGBMClassifier, I got the following error.

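from synapse.ml.lightgbm import LightGBMClassifier

# Custom objective: returns per-row gradient and Hessian of the loss.
def fobj(pred, label):
    ...
    return grad, hess

# Passing a Python function as fobj triggers the ClassCastException below.
lgbm = LightGBMClassifier(fobj=fobj)
model = lgbm.fit(train_sdf)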

java.lang.ClassCastException: class net.razorvine.pickle.objects.ClassDictConstructor cannot be cast to class com.microsoft.azure.synapseml.lightgbm.params.FObjTrait

I understand that the error occurs when the Python object is converted to the FObjTrait type.

If there is a way to use a custom objective function in pyspark, I would appreciate a specific example.
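To make the elided body of `fobj` concrete, here is a minimal sketch of what such a function typically computes, assuming a binary logistic loss on raw (pre-sigmoid) scores. The `(pred, label)` signature mirrors the snippet in this comment; the name `logistic_fobj` and the helper `_sigmoid` are hypothetical. This is plain Python for illustration only and is not a way to pass a custom objective to SynapseML from pyspark.

```python
import math
from typing import List, Sequence, Tuple


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logistic_fobj(
    pred: Sequence[float], label: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Gradient and Hessian of binary log loss w.r.t. the raw scores.

    Assumes `pred` holds raw scores (before the sigmoid) and `label`
    holds 0/1 targets, one entry per row.
    """
    grad: List[float] = []
    hess: List[float] = []
    for raw_score, y in zip(pred, label):
        p = _sigmoid(raw_score)
        grad.append(p - y)          # first derivative of the loss
        hess.append(p * (1.0 - p))  # second derivative of the loss
    return grad, hess
```

Calling `logistic_fobj([0.0], [1.0])` yields a gradient of -0.5 and a Hessian of 0.25 for the single row, since the predicted probability at a raw score of 0 is 0.5.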

1 reaction
imatiach-msft commented, Sep 2, 2021

@andrew-arkhipov it is supported in the Scala API:

  • See the param here: https://github.com/microsoft/SynapseML/blob/master/lightgbm/src/main/scala/com/microsoft/ml/spark/lightgbm/params/LightGBMParams.scala#L305
  • See the param definition here: https://github.com/microsoft/SynapseML/blob/master/lightgbm/src/main/scala/com/microsoft/ml/spark/lightgbm/params/FObjParam.scala
  • See an example in Scala here: https://github.com/microsoft/SynapseML/blob/master/lightgbm/src/test/scala/com/microsoft/ml/spark/lightgbm/split1/VerifyLightGBMClassifier.scala#L338

It’s not yet supported in pyspark because there is no easy way to call the Python process from a Scala worker for an arbitrary function like this. I think I will have to look into the interprocess communication code in Apache Spark to figure out how to enable this scenario.


