
How to only train some of the models in TabularPrediction.fit()

See original GitHub issue

In my task the neural network component is adding a lot to the training time - which isn’t a big deal - and a lot to the inference time - which is more important. The NN component isn’t adding a lot to the already excellent performance of the axis-aligned models in this case.

I’d like to fit the model without the NN. It appears that models can be specified via the hyperparameters option as

hyperparameters={'custom':['GBM']}

but that only GBM is supported at the moment. Can this be extended to the other models?

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 12 (2 by maintainers)

Top GitHub Comments

3 reactions
jwmueller commented, Apr 2, 2020

Note: if you mainly care about prediction time (inference latency), you also don’t necessarily need to prevent AutoGluon from training all of its models. If you don’t require a high-accuracy ensemble, you can instead just tell AutoGluon to use the fastest of these trained models at inference time, like this:

First, invoke your default predictor = task.fit() call without specifying the auto_stack, hyperparameter_tune, or bagging-related arguments. There is no need to specify the hyperparameters dict here, so AutoGluon will train all of its models (assuming the fit() time is acceptable to you; if it is too long, specify only a subset of the models in the hyperparameters dict, as outlined in the other comment below).

Second, call predictor.leaderboard(only_pareto_frontier=True) and find the model with the smallest value of ‘pred_time_val’ (this model is the fastest for predictions), suppose its name is: MODEL_NAME.

Note: make sure you’ve installed the latest version of autogluon from the master branch to access this functionality.

To use solely this faster model for your predictions, simply call: predictor.predict(test_data, model = MODEL_NAME)
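
Putting those steps together, here is a minimal sketch of the whole workflow, assuming the legacy TabularPrediction API used in this thread; the file paths, the 'class' label column, and the 'model' leaderboard column name are placeholders/assumptions, not taken from the original comment:

from autogluon import TabularPrediction as task

# Step 1: default fit() without auto_stack, hyperparameter_tune, bagging arguments,
# or a hyperparameters dict, so every default model gets trained.
train_data = task.Dataset(file_path='train.csv')   # placeholder path
test_data = task.Dataset(file_path='test.csv')     # placeholder path
predictor = task.fit(train_data=train_data, label='class')

# Step 2: inspect the accuracy/latency Pareto frontier and take the model with
# the smallest validation prediction time ('pred_time_val').
board = predictor.leaderboard(only_pareto_frontier=True)
fastest_model = board.loc[board['pred_time_val'].idxmin(), 'model']

# Step 3: route all predictions through that single fast model.
predictions = predictor.predict(test_data, model=fastest_model)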

If you care about online inference with individual test data-points one-at-a-time, then you should first persist the models in memory like this (Note: this functionality is still under active development and not documented):

# Keep the trained models in memory so each predict() call avoids reloading them from disk:
predictor._learner.persist_trainer()
# Subsequent single-row predictions then reuse the persisted model:
pred1 = predictor.predict(test_datapoint1, model=MODEL_NAME)
pred2 = predictor.predict(test_datapoint2, model=MODEL_NAME)
...

3 reactions
jwmueller commented, Apr 2, 2020

Apologies, we should clarify our TabularPrediction documentation. The hyperparameters dict currently accepts the following keys: 'NN', 'GBM', 'CAT', 'RF', 'XT', 'KNN', and 'custom', corresponding to 7 different models that can be trained. To omit one of these models during task.fit(), simply drop that key from the hyperparameters dict. For example, you can tell AutoGluon to only consider RF and GBM models (with their default hyperparameter settings) via:

task.fit(..., hyperparameters={'RF':{}, 'GBM':{}})

The ‘custom’ key offers different functionality than you described. You should not modify the GBM value for this key, but you can drop the key from hyperparameters if you’d like to omit the custom model. If you include ‘custom’ in hyperparameters, then AutoGluon will train additional custom models. For now, the only additional custom model is a second GBM model with different (preset) hyperparameter settings than the default GBM model that gets trained when hyperparameters contains the GBM key.
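
To answer the original question directly, here is a minimal sketch of a fit() call that drops only the 'NN' key, so every other default model type (plus the extra preset GBM from 'custom') is still trained; the file path and the 'class' label column are placeholders, not part of the original answer:

from autogluon import TabularPrediction as task

# Train everything except the neural network by leaving 'NN' out of the dict.
# An empty dict means "use AutoGluon's default settings for that model type".
predictor = task.fit(
    train_data=task.Dataset(file_path='train.csv'),  # placeholder path
    label='class',                                    # placeholder label column
    hyperparameters={
        'GBM': {}, 'CAT': {}, 'RF': {}, 'XT': {}, 'KNN': {},
        'custom': ['GBM'],  # keep the preset second GBM; drop this key to skip it
    },
)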

