Upgrading to 1.0.0-rc2 results in a large drop in classification performance using LightGBMClassifier.

Describe the bug: Updating mmlspark from 1.0.0-rc1-51-df0244c7-SNAPSHOT to 1.0.0-rc2, with everything else in my code unchanged, causes a large drop in validation Average Precision with LightGBMClassifier, from 0.574 to 0.313.

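from mmlspark.lightgbm import LightGBMClassifier

# `features` and `idx_cat_cols` are defined elsewhere in the reporter's code (not shown).
params = {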
  'num_trees': 1000,
  'early_stopping_rounds': 0,
  'feature_fraction': 0.7,
  'l1_reg': 0.0,
  'l2_reg': 0.0,
  'max_depth': -1,
  'num_leaves': 31,
  'is_unbalance': True
}

lgb = LightGBMClassifier(
  featuresCol='features',
  labelCol='Label',
  slotNames=features,
  categoricalSlotNames=idx_cat_cols,
  timeout=12000.0,
  useBarrierExecutionMode=True,
  numIterations=params['num_trees'],
  isUnbalance=params['is_unbalance'],
  earlyStoppingRound=params['early_stopping_rounds'],
  featureFraction=params['feature_fraction'],
  lambdaL1=params['l1_reg'],
  lambdaL2=params['l2_reg'],
  maxDepth=params['max_depth'],
  numLeaves=params['num_leaves']
)

To Reproduce: I am seeing this result on a private dataset with 140,000,000 rows and 130 feature columns. I am a Microsoft employee, so we can talk offline if more details are needed.

Expected behavior: Comparable validation performance between versions.
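For readers comparing two versions on their own data, the metric quoted in the report can be sketched in plain Python. This is a minimal illustration of average precision (mean of precision at each rank where a positive appears), not the evaluator the reporter used; the labels and scores below are made-up toy values, and ties between scores are not handled specially.

```python
def average_precision(labels, scores):
    """Mean of precision@k over the ranks k at which a positive label appears."""
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy data (hypothetical): the same labels scored by two model versions.
labels = [1, 0, 1, 0, 0]
scores_version_a = [0.9, 0.8, 0.7, 0.2, 0.1]
scores_version_b = [0.1, 0.8, 0.2, 0.9, 0.7]

ap_a = average_precision(labels, scores_version_a)
ap_b = average_precision(labels, scores_version_b)
print(f"version A: {ap_a:.3f}, version B: {ap_b:.3f}")
```

Computing the metric with the same code on predictions from both versions rules out a change in the evaluation step as the source of the difference.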

Info (please complete the following information):

  • MMLSpark Version: 1.0.0-rc2
  • Spark Version: 2.4.5
  • Spark Platform: Databricks (runtime 6.6 ML)

If the bug pertains to a specific feature, please tag the appropriate CODEOWNER for better visibility: @imatiach-msft

Additional context: Did any underlying default settings change?
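One way to check whether defaults changed is to dump the estimator's effective parameters under each version and diff them. The sketch below assumes the two dictionaries were captured separately, for example with `{p.name: v for p, v in lgb.extractParamMap().items()}` in each environment; the helper name `diff_params` and the sample values are hypothetical and are not the real mmlspark defaults.

```python
ABSENT = "<absent>"  # marker for a parameter that exists in only one version


def diff_params(old, new):
    """Return {name: (old_value, new_value)} for every parameter that differs."""
    changed = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name, ABSENT)
        after = new.get(name, ABSENT)
        if before != after:
            changed[name] = (before, after)
    return changed


# Placeholder values for illustration only (not actual library defaults).
params_old_version = {"numLeaves": 31, "featureFraction": 0.7, "exampleParam": 1}
params_new_version = {
    "numLeaves": 31,
    "featureFraction": 0.7,
    "exampleParam": 2,
    "addedParam": True,
}

for name, (before, after) in diff_params(params_old_version, params_new_version).items():
    print(f"{name}: {before} -> {after}")
```

An empty diff would suggest the regression comes from the training code itself rather than from changed defaults.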

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 4
  • Comments: 18 (8 by maintainers)

Top GitHub Comments

2 reactions
samins commented, Oct 15, 2020

@imatiach-msft it seems to me the package was broken somewhere after this commit (82e7a8eb).

Here is the log of a simple regression task using rc3 (rc2 has the same issue, but rc1 and the commit I referenced above are fine). I used Spark 2.4.7 (Scala 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_265) on GCP (Debian image).

Notice how the l2 loss explodes after one iteration:

20/10/15 08:45:38 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM task generating dense dataset with 137086 rows and 100 columns
20/10/15 08:45:44 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM task generating dense dataset with 362913 rows and 100 columns
20/10/15 08:45:47 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM task calling LGBM_BoosterUpdateOneIter
20/10/15 08:46:06 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM running iteration: 0 with result: 0 and is finished: false
20/10/15 08:46:06 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Valid l2=3.693259343780418
20/10/15 08:46:06 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM task calling LGBM_BoosterUpdateOneIter
20/10/15 08:46:14 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM running iteration: 1 with result: 0 and is finished: false
20/10/15 08:46:14 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Valid l2=87.718556427486
20/10/15 08:46:14 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM task calling LGBM_BoosterUpdateOneIter
20/10/15 08:46:21 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM running iteration: 2 with result: 0 and is finished: false
20/10/15 08:46:21 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Valid l2=1.0238538051084094E7
20/10/15 08:46:21 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM task calling LGBM_BoosterUpdateOneIter
20/10/15 08:46:27 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM running iteration: 3 with result: 0 and is finished: false
20/10/15 08:46:27 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Valid l2=6.060025082346267E11
20/10/15 08:46:27 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM task calling LGBM_BoosterUpdateOneIter
20/10/15 08:46:33 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM running iteration: 4 with result: 0 and is finished: false
20/10/15 08:46:33 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Valid l2=3.4198023818807816E16
20/10/15 08:46:33 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM task calling LGBM_BoosterUpdateOneIter
20/10/15 08:46:39 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: LightGBM running iteration: 5 with result: 0 and is finished: false
20/10/15 08:46:39 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Valid l2=5.177838695259177E18
20/10/15 08:46:39 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Early stopping, best iteration is 0
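A small script can pull the validation loss out of a log like this and flag where it starts to diverge. This is a rough sketch: the regular expression is written against the `Valid l2=` lines shown above, and the 10x threshold in `first_divergence` is an arbitrary assumption, not something LightGBM or mmlspark defines.

```python
import re

# Matches values such as 3.693259343780418 or 1.0238538051084094E7.
VALID_L2 = re.compile(r"Valid l2=([0-9.]+(?:E[+-]?\d+)?)")


def extract_valid_l2(log_text):
    """Return the validation l2 values in the order they appear in the log."""
    return [float(match.group(1)) for match in VALID_L2.finditer(log_text)]


def first_divergence(losses, factor=10.0):
    """Index of the first loss exceeding factor * best loss so far, else None."""
    best = float("inf")
    for index, loss in enumerate(losses):
        if loss > factor * best:
            return index
        best = min(best, loss)
    return None


# Three lines copied from the log above.
sample_log = """\
20/10/15 08:46:06 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Valid l2=3.693259343780418
20/10/15 08:46:14 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Valid l2=87.718556427486
20/10/15 08:46:21 INFO com.microsoft.ml.spark.lightgbm.LightGBMRegressor: Valid l2=1.0238538051084094E7
"""

losses = extract_valid_l2(sample_log)
print(losses, first_divergence(losses))
```

Running the same check on logs from rc1 and rc3 gives a quick, comparable signal of whether a given build shows the blow-up.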

1 reaction
brunocous commented, Mar 11, 2021

The current release v1.0.0-rc3 still has this issue. The v1.0.0-rc1 version is the latest one without it.
