Generic benchmarking/profiling tool
We have not been proficient at documenting the estimated runtime or space complexity of our estimators and algorithms. Even if we documented asymptotic complexity functions, they would not give a realistic estimate for all parameter settings, etc., on a particular kind of data. Instead, we could assist users in estimating complexity functions empirically.
I would like to see a function something like the following:
def benchmark_estimator_cost(est, X, y=None, fit_params=None,
                             vary_n_samples=True, vary_n_features=False,
                             n_fits=100, time_budget=300, profile_memory=True):
    """Profiles the cost of fitting est on samples of different sizes.

    Parameters
    ----------
    est : estimator
    X : array-like
    y : array-like, optional
    fit_params : dict, optional
    vary_n_samples : bool, default=True
        Whether to benchmark for various random sample sizes.
    vary_n_features : bool, default=False
        Whether to benchmark for various random feature set sizes.
    n_fits : int, default=100
        Maximum number of fits to make while benchmarking.
    time_budget : int, default=300
        Maximum number of seconds to use overall. The current fit will be
        stopped if the budget is exceeded.
    profile_memory : bool, default=True
        Whether to include memory (or just time) profiling. Memory
        profiling will slow down fitting, and hence make fit_time
        estimates more approximate.

    Returns
    -------
    results : dict
        The following keys are each mapped to an array:

        n_samples
            The number of samples
        n_features
            The number of features
        fit_time
            In seconds
        peak_memory
            The memory used at peak of fitting, in KiB.
        model_memory
            The memory in use at the end of fitting, minus that at the
            beginning, in KiB.
    models : dict
        Keys 'peak_memory', 'model_memory' and 'fit_time' map to polynomial
        GP regressors whose inputs are n_samples and n_features and whose
        output is the corresponding target.
    errors : list of dicts
        Lists the parameter settings that resulted in exceptions.
    """
The proposed function would run fit successively for different values of n_samples (logarithmically spaced, perhaps guided by a Gaussian process) to estimate the fitting complexity function, within budget. I have not thought extensively about exactly what sampling strategy would be followed. If this is implemented for the library, we would consider it experimental and the algorithm subject to change for a little while.
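To make this concrete, here is a rough, illustrative sketch of the kind of loop such a helper could run internally: subsample at logarithmically spaced sizes, stop when the wall-clock budget is exhausted, and optionally track peak memory with tracemalloc. None of these names are an actual scikit-learn API; this is just one possible sampling strategy.

```python
import time
import tracemalloc

import numpy as np
from sklearn.base import clone


def _benchmark_fit_times(est, X, y=None, n_fits=10, time_budget=300,
                         profile_memory=True, random_state=0):
    """Illustrative only: time est.fit on logarithmically spaced subsamples.

    Assumes X (and y, if given) are NumPy arrays.
    """
    rng = np.random.RandomState(random_state)
    # Subsample sizes from 10 up to the full dataset, logarithmically spaced.
    sizes = np.unique(np.logspace(
        np.log10(10), np.log10(X.shape[0]), num=n_fits).astype(int))
    records = []
    start = time.perf_counter()
    for n in sizes:
        if time.perf_counter() - start > time_budget:
            break  # overall time budget exhausted
        idx = rng.choice(X.shape[0], size=n, replace=False)
        model = clone(est)
        if profile_memory:
            # tracemalloc adds overhead, so timed fits will run somewhat
            # slower, as the docstring above warns.
            tracemalloc.start()
        t0 = time.perf_counter()
        model.fit(X[idx], y[idx] if y is not None else None)
        fit_time = time.perf_counter() - t0
        peak_kib = None
        if profile_memory:
            _, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            peak_kib = peak / 1024
        records.append({'n_samples': int(n), 'fit_time': fit_time,
                        'peak_memory': peak_kib})
    return records
```

The raw records could then be fed to a regressor (polynomial or Gaussian process, as proposed above) to extrapolate cost to unseen sizes.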
What do others think?
@jnothman #17026 is an implementation of a benchmarking tool for the sample datasets we use in the sklearn examples. It doesn't exactly cover the use case that was in mind for this profiling tool, which was intended to model how an estimator's performance changes as its hyperparameters change.
I'm not sure how well #17026 solves a user's need to estimate how well an algorithm will scale on their specific data. If it does, a tutorial would be beneficial!