Problems with the parameter `learning_rate` in HistGradientBoostingClassifier

Describe the bug

Setting learning_rate to a value larger than 0.1 in HistGradientBoostingClassifier causes a large drop in testing accuracy.

Steps/Code to Reproduce


import time

import numpy as np
from lightgbm import LGBMClassifier
from sklearn.datasets import load_svmlight_file
from sklearn.experimental import enable_hist_gradient_boosting  # noqa: F401
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier


def benchmark(name, model, X_train, y_train, X_test, y_test):
    """Fit the model, then print its test accuracy and wall-clock timings."""
    tic = time.time()
    model.fit(X_train, y_train)
    training_time = time.time() - tic

    tic = time.time()
    y_pred = model.predict(X_test)
    evaluating_time = time.time() - tic

    acc = accuracy_score(y_test, y_pred)
    print('{} Testing Acc: {:.4f}%'.format(name, 100. * acc))
    print('{} Training Time: {:.4f} s'.format(name, training_time))
    print('{} Evaluating Time: {:.4f} s\n'.format(name, evaluating_time))


if __name__ == '__main__':

    n_estimators = 100
    learning_rate = 0.1  # 0.3 and 0.5 give the degraded accuracy in Actual Results
    seed = 0
    n_jobs = 6

    train = load_svmlight_file('../../../Dataset/libsvm/letter_training')
    test = load_svmlight_file('../../../Dataset/libsvm/letter_testing')

    # Densify the sparse features and shift the labels down by one.
    X_train, y_train = np.asanyarray(train[0].toarray(), order='F'), train[1] - 1
    X_test, y_test = np.asanyarray(test[0].toarray(), order='C'), test[1] - 1

    models = [
        # XGBoost (Ver==1.1.1)
        ('XGBoost', XGBClassifier(n_estimators=n_estimators,
                                  learning_rate=learning_rate,
                                  objective='multi:softmax',
                                  random_state=seed,
                                  n_jobs=n_jobs)),
        # LightGBM (Ver==2.3.1)
        ('LightGBM', LGBMClassifier(n_estimators=n_estimators,
                                    learning_rate=learning_rate,
                                    objective='multiclass',
                                    random_state=seed,
                                    n_jobs=n_jobs)),
        # Sklearn-GBDT (Ver==0.22.1)
        ('Sklearn', HistGradientBoostingClassifier(max_iter=n_estimators,
                                                   learning_rate=learning_rate,
                                                   validation_fraction=None,
                                                   random_state=seed)),
    ]

    for name, model in models:
        benchmark(name, model, X_train, y_train, X_test, y_test)

Expected Results

I expect HistGradientBoostingClassifier with learning_rate=0.3 to perform slightly differently from learning_rate=0.1, either better or worse, rather than to degrade drastically.

Actual Results

On the letter dataset from the publicly available LIBSVM dataset collection, HistGradientBoostingClassifier achieves a testing accuracy of 95.74% with learning_rate=0.1, but only 6.16% and 6.06% with learning_rate=0.3 and 0.5, respectively. Similar behavior occurs on other datasets such as USPS.
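A comparison like the one above can be run without the local LIBSVM files. The following is a minimal sketch, assuming scikit-learn's bundled digits dataset as a stand-in for letter; the helper name `sweep_learning_rate` and the reduced `max_iter` are illustrative choices, not part of the original report, and the accuracies it prints are not expected to match the numbers quoted here.

```python
import warnings

from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        # Needed on scikit-learn < 1.0 (the report uses 0.22.1).
        from sklearn.experimental import enable_hist_gradient_boosting  # noqa: F401
    except ImportError:
        pass
from sklearn.ensemble import HistGradientBoostingClassifier


def sweep_learning_rate(rates, max_iter=20, seed=0):
    """Return {learning_rate: test accuracy} on a held-out split of digits.

    Hypothetical helper for illustration; digits replaces the letter dataset.
    """
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y)
    scores = {}
    for lr in rates:
        model = HistGradientBoostingClassifier(
            max_iter=max_iter, learning_rate=lr, random_state=seed)
        model.fit(X_train, y_train)
        scores[lr] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for lr, acc in sweep_learning_rate([0.1, 0.3, 0.5]).items():
        print("learning_rate={:.1f}: test accuracy {:.4f}".format(lr, acc))
```

Running the same sweep for each library, with every other hyperparameter held fixed, makes it easier to tell whether a drop is specific to one implementation.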

Versions

sklearn: 0.22.1
numpy: 1.18.1
scipy: 1.4.1
Cython: 0.29.15
pandas: 1.0.1
matplotlib: 3.1.3
joblib: 0.14.1

Issue Analytics

State:
Created 3 years ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

xuyxu commented, Jun 29, 2020

I also observe a huge performance degradation on LightGBM and XGBoost after using get_equivalent_model to pass parameters. If this is the expected behavior, this issue can be closed 😃. Thanks.

NicolasHug commented, Jun 29, 2020

I cannot reproduce your results, @AaronX121: much like sklearn, LightGBM shows heavily degraded performance with a learning rate higher than 0.1 when comparable hyperparameters are set with get_equivalent_model (I haven’t tried XGBoost).

I suspect that the discrepancy you have comes from different ways of handling early stopping, though I haven’t looked into it in detail.
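One way to rule out early stopping on the scikit-learn side is to switch it off explicitly, so that every run performs exactly `max_iter` iterations. The sketch below assumes that scikit-learn 0.23 and later expose an `early_stopping` parameter, while 0.21 and 0.22 disable early stopping when `n_iter_no_change` is `None`; the helper name `without_early_stopping` is hypothetical. LightGBM and XGBoost have their own early-stopping settings, which should be checked against their documentation.

```python
try:
    # Needed on scikit-learn < 1.0 (the report uses 0.22.1).
    from sklearn.experimental import enable_hist_gradient_boosting  # noqa: F401
except ImportError:
    pass
from sklearn.ensemble import HistGradientBoostingClassifier


def without_early_stopping(**kwargs):
    """Build a HistGradientBoostingClassifier with early stopping disabled.

    Hypothetical helper: it inspects the installed version's parameters
    instead of assuming one particular scikit-learn release.
    """
    available = HistGradientBoostingClassifier().get_params()
    if "early_stopping" in available:
        # Newer releases: explicit on/off switch.
        kwargs["early_stopping"] = False
    else:
        # Older releases: None means no early stopping is done.
        kwargs["n_iter_no_change"] = None
    return HistGradientBoostingClassifier(**kwargs)


model = without_early_stopping(max_iter=100, learning_rate=0.3, random_state=0)
```

After fitting, the `n_iter_` attribute reports how many boosting iterations were actually run, which can be compared against `max_iter`.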

Also, note that 0.1 seems to be the upper limit for the learning rate: setting it to 0.001 gets you decent results.
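For context on what the parameter controls: in gradient boosting, each new tree's output is multiplied by the learning rate before it is added to the running prediction. The following toy sketch shows that update for squared error with one-split stumps, using only the standard library. The names `fit_stump` and `boost` are made up for illustration; this is not scikit-learn's implementation, and it does not reproduce the multiclass degradation discussed in this issue.

```python
def fit_stump(xs, residuals):
    """Return the best single-threshold stump on 1-D inputs (squared error)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every threshold leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    _, t, lv, rv = best
    return lambda x: lv if x <= t else rv


def boost(xs, ys, learning_rate, n_rounds):
    """Return the training MSE after each round of toy boosting."""
    preds = [sum(ys) / len(ys)] * len(xs)  # start from the mean target
    history = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        # The learning rate scales the contribution of every new tree.
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return history


if __name__ == "__main__":
    xs = [i / 10 for i in range(20)]
    ys = [x * x for x in xs]
    for lr in (0.001, 0.1, 0.5):
        print(lr, boost(xs, ys, lr, n_rounds=10)[-1])
```

A smaller learning rate means smaller steps per iteration, so it usually needs a larger `max_iter` to reach the same training loss.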
