Benchmark results with better parameters

I used a laptop for a better demo benchmark:

  • Intel Core i7-7700HQ (4 cores, 8 threads), unthrottled
  • 32GB RAM DDR4 2400 MHz (dual channel)
  • Python 3.6, scikit-learn 0.20, numba 0.40.1

Setup for the proper benchmarking:

  • No LightGBM / pygbm warmup allowed
  • 1 million training samples (10 million might crash even with 64GB RAM; pygbm requires at least 24GB RAM for 1 million)
  • 500 training iterations
  • 255 leaves
  • 0.05 learning rate (could be changed to 0.10 for easier comparison with independent benchmarks)

The benchmark in the master branch (https://github.com/ogrisel/pygbm/blob/master/benchmarks/bench_higgs_boson.py) is far too short and, because it finishes so quickly, does not really test the speed of the whole model. Returns diminish as the number of iterations increases, and that is the part that is difficult to optimize once the tree construction itself is already optimized.

Results:

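| Model    | Time     | AUC    | Comments                                                          |
|----------|----------|--------|-------------------------------------------------------------------|
| LightGBM | 45.260s  | 0.8293 | Reference, runnable with 8GB RAM.                                 |
| pygbm    | 359.101s | 0.8180 | Requires over 24GB RAM. Slower as more trees are added over time. |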

Conclusion:

  • pygbm is 5 to 10 times slower (about 8 times in this run), but slower does not mean worse. It is actually very fast compared to xgboost with the exact method 2 years ago, and today it can be considered competitive in speed with xgboost exact, provided you have enough RAM.
  • pygbm requires far too much RAM. You only notice this when using many iterations, because memory usage seems to increase linearly with them.
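
For reference, the slowdown factor follows directly from the two timings reported in the results above. This is only arithmetic on the numbers in this post; the variable names are mine.

```python
# Timings (seconds) and ROC AUC values copied from the results above.
lightgbm_time, lightgbm_auc = 45.260, 0.8293
pygbm_time, pygbm_auc = 359.101, 0.8180

# How many times slower pygbm was than LightGBM in this run.
slowdown = pygbm_time / lightgbm_time
print(f"pygbm / LightGBM time ratio: {slowdown:.2f}x")

# Absolute difference in ROC AUC between the two models.
auc_gap = lightgbm_auc - pygbm_auc
print(f"AUC gap: {auc_gap:.4f}")
```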

To run the benchmark, one can use the following for a clean setup. It is not optimized for the fastest performance, but it gives you the prerequisites (scikit-learn 0.20, numba 0.39):

pip install lightgbm
pip install -U scikit-learn
pip install -U numba

git clone https://github.com/ogrisel/pygbm.git
cd pygbm

Before installing pygbm, change the following at lines 146-147 of pygbm/grower.py (https://github.com/ogrisel/pygbm/blob/master/pygbm/grower.py#L146-L147):

            node.construction_speed = (node.sample_indices.shape[0] /
                                       node.find_split_time)

to:

            node.construction_speed = (node.sample_indices.shape[0] / 1.0)

This avoids the infamous divide-by-zero error, which can occur when node.find_split_time is 0.
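
Hard-coding the divisor to 1.0 throws away the timing information. A less invasive workaround might be to guard the division instead. The following is a minimal standalone sketch, not the pygbm code: the helper name `safe_construction_speed` and the `eps` floor are hypothetical.

```python
def safe_construction_speed(n_samples, find_split_time, eps=1e-9):
    """Samples per second, with the elapsed time floored at eps.

    Hypothetical helper: it mirrors the expression in grower.py but
    never divides by zero when the timer reports exactly 0.0.
    """
    return n_samples / max(find_split_time, eps)


# A split measured as taking 0.0 seconds no longer raises.
print(safe_construction_speed(1000, 0.0))
# A normal measurement is unchanged by the guard.
print(safe_construction_speed(1000, 0.5))
```

In grower.py the same idea would amount to replacing the divisor with `max(node.find_split_time, eps)`; I have not tested that change against pygbm itself.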

Then, one can run the following:

pip install --editable .

If you have a slow Internet connection, download the HIGGS dataset ahead of time from https://archive.ics.uci.edu/ml/machine-learning-databases/00280/ and then uncompress it.
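
The archive can also be uncompressed from Python. A minimal sketch, assuming the downloaded file is gzip-compressed; the helper name `gunzip` and the file names in the example are placeholders to adjust to your download.

```python
import gzip
import shutil


def gunzip(src_path, dst_path, chunk_size=1 << 20):
    """Stream-decompress a .gz file to dst_path, one chunk at a time."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj reads chunk_size bytes at a time, so the whole
        # dataset never has to fit in memory.
        shutil.copyfileobj(src, dst, chunk_size)
    return dst_path


# Example (placeholder paths):
# gunzip("HIGGS.csv.gz", "HIGGS.csv")
```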

Then, you may run a proper benchmark using the following (make sure to change load_path to point to your HIGGS csv file):

import os
from time import time
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from pygbm import GradientBoostingMachine
from lightgbm import LGBMRegressor
import numba
import gc


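n_leaf_nodes = 255  # Maximum number of leaves per tree
n_trees = 500  # Number of boosting iterations
lr = 0.05  # Learning rate
max_bins = 255  # Passed to pygbm as max_bins
load_path = "mnt/HIGGS/HIGGS.csv"  # Change this to your HIGGS csv file
subsample = 1000000  # Change this to 10000000 if you wish, or to None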

df = pd.read_csv(load_path, header=None, dtype=np.float32)
target = df.values[:, 0]
data = np.ascontiguousarray(df.values[:, 1:])
data_train, data_test, target_train, target_test = train_test_split(
    data, target, test_size=50000, random_state=0)

if subsample is not None:
    data_train, target_train = data_train[:subsample], target_train[:subsample]

n_samples, n_features = data_train.shape
print(f"Training set with {n_samples} records with {n_features} features.")

# Includes warmup time penalty
print("Fitting a LightGBM model...")
tic = time()
lightgbm_model = LGBMRegressor(n_estimators=n_trees, num_leaves=n_leaf_nodes,
                               learning_rate=lr, silent=False)
lightgbm_model.fit(data_train, target_train)
toc = time()
predicted_test = lightgbm_model.predict(data_test)
roc_auc = roc_auc_score(target_test, predicted_test)
print(f"done in {toc - tic:.3f}s, ROC AUC: {roc_auc:.4f}")
del lightgbm_model
del predicted_test
gc.collect()

# Includes warmup time penalty
print("Fitting a pygbm model...")
tic = time()
pygbm_model = GradientBoostingMachine(learning_rate=lr, max_iter=n_trees,
                                      max_bins=max_bins,
                                      max_leaf_nodes=n_leaf_nodes,
                                      random_state=0, scoring=None,
                                      verbose=1, validation_split=None)
pygbm_model.fit(data_train, target_train)
toc = time()
predicted_test = pygbm_model.predict(data_test)
roc_auc = roc_auc_score(target_test, predicted_test)
print(f"done in {toc - tic:.3f}s, ROC AUC: {roc_auc:.4f}")
del pygbm_model
del predicted_test
gc.collect()


if hasattr(numba, 'threading_layer'):
    print("Threading layer chosen: %s" % numba.threading_layer())
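
To check the RAM observations while running the script, the peak resident memory of the process can be printed after each fit. A minimal sketch, assuming a Unix-like system (the standard `resource` module is not available on Windows); the helper name `peak_rss_mb` is hypothetical.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of the current process, in MiB.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


print(f"Peak RSS so far: {peak_rss_mb():.1f} MiB")
```

Note that this is a high-water mark for the whole process, so a value read after the pygbm fit also covers the earlier LightGBM fit and the data loading. Running each model in a separate process would give cleaner per-model numbers.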

If something is missing in the script, please let me know.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 23 (17 by maintainers)

Top GitHub Comments

ogrisel commented, Nov 2, 2018 (1 reaction)

@dhirschfeld there are no nested prange loops in pygbm so far and we don’t do any linear algebra, numpy is just used as a passive datastructure (no BLAS routines used) so composability is probably useless in this context.
