Investigate Pytorch AutoML 4.6% worse than TF AutoML for MercedesBenzGreener
Background: The MercedesBenzGreener dataset was one of the 12 datasets used to form the AutoML heuristics.
For the MercedesBenzGreener dataset, the RMSE from a 1-hour AutoML run on pytorch/master (8.437) is 4.6%
worse than the RMSE from a 1-hour run on tf/tf-legacy (8.051). This ticket tracks investigating that difference.
https://docs.google.com/spreadsheets/d/1c1ghzlNBGH8Sh0AgxZw8Z366NZegiblxOuwQB9eHHf4/edit#gid=2057342027
This script can be used to run AutoML for 1 hour:
https://github.com/ludwig-ai/experiments/blob/main/automl/heuristics/mercedes_benz_greener/run_auto_train_1hr.py
The run will produce hyperopt_statistics.json.
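For reference, a minimal sketch of that 1-hour run, assuming Ludwig's auto_train API and its bundled
mercedes_benz_greener dataset loader (the exact invocation is in the linked script; "y" is the Kaggle
dataset's regression target):

    import pprint

    from ludwig.automl import auto_train
    from ludwig.datasets import mercedes_benz_greener

    # Download/cache the Mercedes-Benz Greener Manufacturing dataset
    df = mercedes_benz_greener.load()

    # Run AutoML with a 1-hour time budget, matching the runs above
    auto_train_results = auto_train(
        dataset=df,
        target="y",
        time_limit_s=3600,
        tune_for_memory=False,
    )
    pprint.pprint(auto_train_results)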
This script can be used to post-process the resulting hyperopt_statistics.json to get the test RMSE:
https://github.com/ludwig-ai/experiments/blob/main/utils/best_hyperopt_statistics.py
For example, if you are in the directory experiments/automl/heuristics/mercedes_benz_greener, run
python ../../../utils/best_hyperopt_statistics.py hyperopt_statistics.json root_mean_squared_error
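The utility essentially scans the trials recorded in the JSON for the best value of the requested metric.
A rough sketch of that logic, assuming a layout in which hyperopt_statistics.json carries a hyperopt_results
list with per-trial eval_stats (see the linked utility for the authoritative version):

    import json
    import sys

    # Usage: python sketch.py hyperopt_statistics.json root_mean_squared_error
    stats_path, metric = sys.argv[1], sys.argv[2]

    with open(stats_path) as f:
        stats = json.load(f)

    best = None
    for trial in stats["hyperopt_results"]:
        # eval_stats is assumed to be keyed by output feature name,
        # each holding a dict of metrics for the test split
        for feature_stats in trial["eval_stats"].values():
            if isinstance(feature_stats, dict) and metric in feature_stats:
                value = feature_stats[metric]
                if best is None or value < best:  # lower RMSE is better
                    best = value

    print(f"best {metric}: {best}")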
Note that MercedesBenzGreener is a small, wide dataset, and that TF itself shows signs of overfitting when the
sanity-check runs of TF AutoML are compared against the full heuristics runs: the validation-split metric is
better than the baseline run from the heuristics search, but the test-split metric is worse.
https://docs.google.com/spreadsheets/d/1c1ghzlNBGH8Sh0AgxZw8Z366NZegiblxOuwQB9eHHf4/edit#gid=475619705
This is mentioned in the Ludwig AutoML blog draft:
https://docs.google.com/document/d/1AF1gDGqqaQkNcMKEgrNHgxB275ZBemdta5lk1SBLdxM/edit#
Top GitHub Comments
Learning curves are much more similar too, with input column types fixed:
@amholler I think you are right about the input types. I ran this again, using the user_config option to force
both versions to use the same input data types. Good news: both versions land on mostly similar parameters, the
same batch size, and closer final metric scores: 60.0685 (tf) vs. 60.6626 (torch)
Code:
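A minimal sketch of pinning the input feature types through auto_train's user_config override; the column
names and types below are illustrative placeholders, not the dataset's actual schema:

    from ludwig.automl import auto_train
    from ludwig.datasets import mercedes_benz_greener

    df = mercedes_benz_greener.load()

    # Hypothetical override: pin every input column to an explicit type so
    # pytorch/master and tf/tf-legacy see identical input features
    user_config = {
        "input_features": [
            {"name": "X0", "type": "category"},  # placeholder column/type pairs
            {"name": "X10", "type": "binary"},
            # ... one entry per input column
        ],
    }

    auto_train_results = auto_train(
        dataset=df,
        target="y",
        time_limit_s=3600,
        tune_for_memory=False,
        user_config=user_config,
    )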
Pytorch results:
tf-legacy results: