Investigate Pytorch AutoML 4.6% worse than TF AutoML for MercedesBenzGreener
Background: The MercedesBenzGreener dataset was one of the 12 datasets used to form the AutoML heuristics.
For the MercedesBenzGreener dataset, the RMSE from a 1-hour AutoML run on pytorch/master (8.437) is 4.6%
worse than the RMSE from a 1-hour run on tf/tf-legacy (8.051). This ticket tracks investigating that difference.
https://docs.google.com/spreadsheets/d/1c1ghzlNBGH8Sh0AgxZw8Z366NZegiblxOuwQB9eHHf4/edit#gid=2057342027
This script can be used to run AutoML for 1 hour:
https://github.com/ludwig-ai/experiments/blob/main/automl/heuristics/mercedes_benz_greener/run_auto_train_1hr.py
The run will produce hyperopt_statistics.json.
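For reference, a minimal sketch of that 1-hour run, assuming Ludwig's auto_train API and its bundled
mercedes_benz_greener dataset loader (the exact invocation is in the linked script; "y" is the Kaggle
dataset's regression target):

    import pprint

    from ludwig.automl import auto_train
    from ludwig.datasets import mercedes_benz_greener

    # Download/cache the Mercedes-Benz Greener Manufacturing dataset
    df = mercedes_benz_greener.load()

    # Run AutoML with a 1-hour time budget, matching the runs above
    auto_train_results = auto_train(
        dataset=df,
        target="y",
        time_limit_s=3600,
        tune_for_memory=False,
    )
    pprint.pprint(auto_train_results)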
This script can be used to post-process the resulting hyperopt_statistics.json to get the test RMSE:
https://github.com/ludwig-ai/experiments/blob/main/utils/best_hyperopt_statistics.py
For example, if you are in the directory experiments/automl/heuristics/mercedes_benz_greener, run
python ../../../utils/best_hyperopt_statistics.py hyperopt_statistics.json root_mean_squared_error
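The utility essentially scans the trials recorded in the JSON for the best value of the requested metric.
A rough sketch of that logic, assuming a layout in which hyperopt_statistics.json carries a hyperopt_results
list with per-trial eval_stats (see the linked utility for the authoritative version):

    import json
    import sys

    # Usage: python sketch.py hyperopt_statistics.json root_mean_squared_error
    stats_path, metric = sys.argv[1], sys.argv[2]

    with open(stats_path) as f:
        stats = json.load(f)

    best = None
    for trial in stats["hyperopt_results"]:
        # eval_stats is assumed to be keyed by output feature name,
        # each holding a dict of metrics for the test split
        for feature_stats in trial["eval_stats"].values():
            if isinstance(feature_stats, dict) and metric in feature_stats:
                value = feature_stats[metric]
                if best is None or value < best:  # lower RMSE is better
                    best = value

    print(f"best {metric}: {best}")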
Note that MercedesBenzGreener is a small, wide dataset, and that TF itself shows signs of overfitting when the
sanity-check runs of TF AutoML are compared against the full heuristics runs: the validation-split metric is
better than the baseline run from the heuristics search, but the test-split metric is worse.
https://docs.google.com/spreadsheets/d/1c1ghzlNBGH8Sh0AgxZw8Z366NZegiblxOuwQB9eHHf4/edit#gid=475619705
This is mentioned in the Ludwig AutoML blog draft:
https://docs.google.com/document/d/1AF1gDGqqaQkNcMKEgrNHgxB275ZBemdta5lk1SBLdxM/edit#
Top GitHub Comments
Learning curves are much more similar too, with input column types fixed:
@amholler I think you are right about the input types. I ran this again, using the user_config option to force
both versions to use the same input data types. Good news: both versions land on mostly similar parameters, the
same batch size, and closer final metric scores: 60.0685 (tf) vs. 60.6626 (torch)
Code:
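A minimal sketch of pinning the input feature types through auto_train's user_config override; the column
names and types below are illustrative placeholders, not the dataset's actual schema:

    from ludwig.automl import auto_train
    from ludwig.datasets import mercedes_benz_greener

    df = mercedes_benz_greener.load()

    # Hypothetical override: pin every input column to an explicit type so
    # pytorch/master and tf/tf-legacy see identical input features
    user_config = {
        "input_features": [
            {"name": "X0", "type": "category"},  # placeholder column/type pairs
            {"name": "X10", "type": "binary"},
            # ... one entry per input column
        ],
    }

    auto_train_results = auto_train(
        dataset=df,
        target="y",
        time_limit_s=3600,
        tune_for_memory=False,
        user_config=user_config,
    )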
Pytorch results:
tf-legacy results: