
In the lightgbm_tuner_simple.py example, early stopping is not working properly.

See original GitHub issue

In the lightgbm_tuner_simple.py example, early stopping is not working properly, as shown below. I got the following log by executing lightgbm_tuner_simple.py.

# First trial
[1]	valid_0's binary_logloss: 0.581604	valid_1's binary_logloss: 0.587863
...
[43]	valid_0's binary_logloss: 0.0236828	valid_1's binary_logloss: 0.145822
...
[143]	valid_0's binary_logloss: 4.13863e-05	valid_1's binary_logloss: 0.265754
Early stopping, best iteration is:
[43]	valid_0's binary_logloss: 0.0236828	valid_1's binary_logloss: 0.145822
# Early stopping works fine in the first trial.

# Second trial
[1]	valid_0's binary_logloss: 0.580784	valid_1's binary_logloss: 0.586189
[2]	valid_0's binary_logloss: 0.514544	valid_1's binary_logloss: 0.524775
...
[43]	valid_0's binary_logloss: 0.0207709	valid_1's binary_logloss: 0.149757
...
[143]	valid_0's binary_logloss: 3.01071e-05	valid_1's binary_logloss: 0.318618
Early stopping, best iteration is:
[43]	valid_0's binary_logloss: 0.0236828	valid_1's binary_logloss: 0.145822
# Early stopping does not work correctly in the second trial. It reports the best iteration of the first trial.

I think this happens because the early_stopping() function creates a closure, and the variables inside that closure are shared across trials. If I use the early_stopping_rounds parameter instead of the early_stopping callback, early stopping works properly, although the following warning is displayed.

UserWarning: 'early_stopping_rounds' argument is deprecated and will be removed in a future release of LightGBM. Pass 'early_stopping()' callback via 'callbacks' argument instead.
  _log_warning("'early_stopping_rounds' argument is deprecated and will be removed in a future release of LightGBM. "

Environment

  • Optuna version: 2.10.0
  • Python version: 3.8.18
  • OS: Ubuntu 20.04.2

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11

Top GitHub Comments

1 reaction
nzw0301 commented, Dec 6, 2021

Thank you for reporting the bug! Indeed, I could reproduce the same behaviour with the latest lightgbm in a Colab notebook. I think this issue is related to optuna.integration.lightgbm, not the example, so I’ll transfer this issue to the optuna/optuna repo.


Code I used:

!pip install -U lightgbm optuna

# lightgbm.__version__, optuna.__version__ == ('3.3.1', '2.10.0')


import numpy as np
import optuna.integration.lightgbm as lgb

from lightgbm import early_stopping
from lightgbm import log_evaluation
import sklearn.datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

data, target = sklearn.datasets.load_breast_cancer(return_X_y=True)
train_x, val_x, train_y, val_y = train_test_split(data, target, test_size=0.25)
dtrain = lgb.Dataset(train_x, label=train_y)
dval = lgb.Dataset(val_x, label=val_y)

params = {
    "objective": "binary",
    "metric": "binary_logloss",
    "verbosity": -1,
    "boosting_type": "gbdt",
}

model = lgb.train(
    params,
    dtrain,
    valid_sets=[dtrain, dval],
    callbacks=[early_stopping(100), log_evaluation(100)],
    # early_stopping_rounds=100
)

prediction = np.rint(model.predict(val_x, num_iteration=model.best_iteration))
accuracy = accuracy_score(val_y, prediction)

best_params = model.params
print("Best params:", best_params)
print("  Accuracy = {}".format(accuracy))
print("  Params: ")
for key, value in best_params.items():
    print("    {}: {}".format(key, value))
0 reactions
github-actions[bot] commented, Oct 2, 2022

This issue was closed automatically because it had not seen any recent activity. If you want to discuss it, you can reopen it freely.
