
Difference between f1 metric and log_loss metric performance for lightgbm


Ecosystem used:

autogluon.core==0.3.2b20211111
autogluon.extra==0.3.2b20211111
autogluon.features==0.3.2b20211111
autogluon.tabular==0.3.2b20211111

Machine: Ubuntu, AWS r4.4xlarge

Using almost the same code (with more subsample_size values) as was previously used in this issue:

import pandas as pd
import numpy as np

from autogluon.tabular import TabularDataset, TabularPredictor

path_prefix = 'https://autogluon.s3.amazonaws.com/datasets/Inc/'
path_train = path_prefix + 'train.csv'
path_test = path_prefix + 'test.csv'

train_data = TabularDataset(path_train)
test_data = TabularDataset(path_test)

outputs = []
for subsample_size in [
    5000,
    10000,
    50000,
    100000,
    150000,
    200000,
    400000,
    800000,
    1600000,
    3200000,
    6400000,
]:
    if subsample_size is not None and (subsample_size < len(train_data)):
        train_data_sample = train_data.sample(n=subsample_size, random_state=0).reset_index(drop=True)
    else:
        train_data_sample = train_data.sample(n=subsample_size, random_state=0, replace=True).reset_index(drop=True)

    label = 'class'
    predictor = TabularPredictor(
        label=label,
        verbosity=2,
        problem_type='binary',
        eval_metric='f1',
    ).fit(
        train_data_sample,
        tuning_data=test_data,
        # ag_args_fit={'stopping_metric': 'log_loss'},
        hyperparameters={'GBM': {}, 'XGB': {}}
    )
    output = predictor.leaderboard()
    output['data_size'] = subsample_size
    outputs.append(output)

The following results are obtained:

[Screenshot: leaderboard results with eval_metric='f1']

When I change the metric to log_loss, the results are following:

[Screenshot: leaderboard results with eval_metric='log_loss']

What one can see is a significant reduction in LightGBM training time for the smaller data sizes; for the larger ones, however, it goes up significantly.
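
For reference, the commented-out ag_args_fit line in the snippet above suggests one way to decouple the two: keep f1 as the reported eval_metric while letting the models early-stop on log_loss. A minimal sketch of that variant (my assumption of how the override is applied; I have not benchmarked it here):

from autogluon.tabular import TabularPredictor

# Same data, label and models as in the snippet above; only the stopping metric changes.
predictor = TabularPredictor(
    label='class',
    verbosity=2,
    problem_type='binary',
    eval_metric='f1',                             # still reported and optimized as f1
).fit(
    train_data_sample,
    tuning_data=test_data,
    ag_args_fit={'stopping_metric': 'log_loss'},  # early stopping driven by log_loss instead
    hyperparameters={'GBM': {}, 'XGB': {}},
)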

We also encountered this issue when training on our private dataset, with roughly a 2-3x reduction in LightGBM training time on data with 1.4M rows and 1500 columns when using the log_loss metric instead of f1. Below are the shape of the data, the dtypes, and the hyperparameters used in training, together with a time comparison for our case:

autogluon_fit_hparams = {
    'model_feature_type_group_map_special': {'text': ['search_term', 'keyword']},
    'hyperparameters': {'GBM': {}},
    'hyperparameter_tune_kwargs': 'auto',
    'presets': 'medium_quality_faster_train',
    'num_bag_folds': 2,
    'num_bag_sets': 2,
    'keep_only_best': True,
    'save_space': True,
    'distill': False,
    'use_bag_holdout': True,
    'verbosity': 2,
}
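
For clarity, a simplified sketch of how these settings would map onto a TabularPredictor.fit call (an illustration only, not our actual wrapper code; 'train_df' and 'target' are placeholders, and I am assuming that keys such as 'model_feature_type_group_map_special' and 'distill' are handled by our own pipeline rather than passed to fit directly, since they are not standard fit arguments):

from autogluon.tabular import TabularPredictor

# Keys consumed by our own wrapper rather than by TabularPredictor.fit (assumption).
wrapper_keys = ('model_feature_type_group_map_special', 'distill', 'verbosity')
fit_kwargs = {k: v for k, v in autogluon_fit_hparams.items() if k not in wrapper_keys}

predictor = TabularPredictor(
    label='target',                                # placeholder label column
    problem_type='binary',
    eval_metric='f1',                              # or 'log_loss' for the faster runs below
    verbosity=autogluon_fit_hparams['verbosity'],
).fit(
    train_df,                                      # placeholder for our private 1.4M x 1500 dataset
    **fit_kwargs,                                  # presets, hyperparameters, bagging settings, etc.
)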

Our dataset info:

[Screenshots: dataset shape and column dtypes]

We also tested a version with 100K data rows.

Training times as reported by AutoGluon:

  • 100K rows: 43 min 8 sec with f1 (LightGBM) vs. 15 min 30 sec with log_loss (LightGBM)

  • Full data (1,377,524 rows): 8 hr 28 min 40 sec with f1 (LightGBM) vs. 4 hr 6 min 27 sec with log_loss (LightGBM)

So for our data we obtained lower LightGBM training times with log_loss even with almost 1.4M rows (roughly a 2.1x speed-up), although the difference was less pronounced than with 100K rows (roughly 2.8x).
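
My own guess at a contributing factor (an assumption on my side, not something confirmed in this issue): f1 is not a native LightGBM metric, so evaluating it during training requires a Python callback (including a probability-thresholding step) at every evaluation round, whereas log_loss maps to the built-in binary_logloss. A standalone sketch outside AutoGluon that contrasts the two setups:

import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=50_000, n_features=50, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.2, random_state=0)

def f1_eval(y_true, y_prob):
    # Custom metric: threshold the predicted probabilities, then compute f1 in Python.
    return 'f1', f1_score(y_true, (y_prob >= 0.5).astype(int)), True

# Variant 1: validation f1 evaluated via the Python callback at every round.
clf_f1 = lgb.LGBMClassifier(n_estimators=2000)
clf_f1.fit(X_tr, y_tr, eval_set=[(X_va, y_va)], eval_metric=f1_eval,
           callbacks=[lgb.early_stopping(50)])

# Variant 2: validation loss evaluated natively as binary_logloss.
clf_ll = lgb.LGBMClassifier(n_estimators=2000)
clf_ll.fit(X_tr, y_tr, eval_set=[(X_va, y_va)], eval_metric='binary_logloss',
           callbacks=[lgb.early_stopping(50)])

Timing the two fit calls on a larger dataset should show whether the callback overhead alone explains the gap, or whether the two metrics also lead to different numbers of boosting rounds.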

I would like to hear your thoughts on these differences, thank you.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
freefeynman123 commented, Feb 2, 2022

Ok, I'll investigate how it looks with the newest features, thank you!

0 reactions
Innixma commented, Mar 7, 2022

Marking this as resolved. Please re-open if the issue persists on pre-release.
