Investigate why PyTorch AutoML is 4.6% worse than TF AutoML for MercedesBenzGreener

Background: The MercedesBenzGreener dataset was one of the 12 datasets used to form the AutoML heuristics.

For the MercedesBenzGreener dataset, the RMSE from a 1 hr AutoML run using pytorch/master (8.437) is 4.6%
worse than the RMSE from a 1 hr run using tf/tf-legacy (8.051). This ticket tracks the investigation of that difference.
 https://docs.google.com/spreadsheets/d/1c1ghzlNBGH8Sh0AgxZw8Z366NZegiblxOuwQB9eHHf4/edit#gid=2057342027
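
As a quick check of the arithmetic, the sketch below computes the relative RMSE gap from the two figures quoted above. The ticket does not say which value is the denominator, so both options are shown as an assumption: dividing by the PyTorch RMSE reproduces the quoted 4.6%, while dividing by the TF baseline gives roughly 4.8%. The helper name relative_gap is illustrative only.

```python
def relative_gap(worse: float, better: float, denominator: float) -> float:
    """Return (worse - better) / denominator as a fraction."""
    return (worse - better) / denominator


PYTORCH_RMSE = 8.437  # pytorch/master, 1 hr run (figure from the ticket)
TF_RMSE = 8.051       # tf/tf-legacy, 1 hr run (figure from the ticket)

# Assumption: the ticket's 4.6% is normalized by the PyTorch value.
gap_vs_pytorch = relative_gap(PYTORCH_RMSE, TF_RMSE, PYTORCH_RMSE)
# Alternative convention: normalize by the TF baseline instead.
gap_vs_tf = relative_gap(PYTORCH_RMSE, TF_RMSE, TF_RMSE)

print(f"relative to PyTorch RMSE: {gap_vs_pytorch:.1%}")
print(f"relative to TF RMSE:      {gap_vs_tf:.1%}")
```

Either way the gap is under 5%, so the choice of denominator does not change the conclusion.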

This script can be used to run AutoML for 1 hour:
 https://github.com/ludwig-ai/experiments/blob/main/automl/heuristics/mercedes_benz_greener/run_auto_train_1hr.py
The run will produce hyperopt_statistics.json.

This script can be used to post-process the resulting hyperopt_statistics.json to get the test RMSE:
 https://github.com/ludwig-ai/experiments/blob/main/utils/best_hyperopt_statistics.py
For example, if you are in the directory experiments/automl/heuristics/mercedes_benz_greener, run
 python ../../../utils/best_hyperopt_statistics.py hyperopt_statistics.json root_mean_squared_error
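
For readers who want a rough idea of what this kind of post-processing involves, here is a minimal sketch. It is not the linked best_hyperopt_statistics.py script. The JSON layout (a top-level "hyperopt_results" list whose entries carry "metric_score" and per-feature "eval_stats") is an assumption, as are the helper name best_test_metric and the made-up sample values. It also assumes that a lower metric_score is better.

```python
import json


def best_test_metric(stats: dict, output_feature: str, metric: str) -> float:
    """Pick the trial with the lowest metric_score and return its test metric."""
    trials = stats["hyperopt_results"]  # assumed top-level key
    # Assumption: lower metric_score is better (e.g. a validation loss).
    best = min(trials, key=lambda trial: trial["metric_score"])
    # Assumed nesting: eval_stats -> output feature -> metric name.
    return best["eval_stats"][output_feature][metric]


# Made-up sample in the assumed layout, only to exercise the function.
sample = json.loads("""
{
  "hyperopt_results": [
    {"metric_score": 2.0, "eval_stats": {"y": {"root_mean_squared_error": 1.5}}},
    {"metric_score": 1.0, "eval_stats": {"y": {"root_mean_squared_error": 1.2}}}
  ]
}
""")

print(best_test_metric(sample, "y", "root_mean_squared_error"))
```

To try it on a real run, load hyperopt_statistics.json with json.load and adjust the keys if the actual file is laid out differently.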

Note that MercedesBenzGreener is a small, wide dataset, and that tf itself shows signs of overfitting in the sanity
check runs of tf AutoML against the full heuristics runs: the validation split is better than the baseline run from the
heuristics search, but the test split performs worse.
 https://docs.google.com/spreadsheets/d/1c1ghzlNBGH8Sh0AgxZw8Z366NZegiblxOuwQB9eHHf4/edit#gid=475619705
 This is mentioned in the Ludwig AutoML blog draft
  https://docs.google.com/document/d/1AF1gDGqqaQkNcMKEgrNHgxB275ZBemdta5lk1SBLdxM/edit#

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 14

Top GitHub Comments

dantreiman commented, Feb 15, 2022

Learning curves are much more similar too, with input column types fixed:

[Figure: learning curves, pytorch-master]

[Figure: learning curves, tf-legacy]

dantreiman commented, Feb 15, 2022

@amholler I think you are right about the input types. I ran this again, using the user_config option to force both versions to use the same input data types.

Good news: both versions land on mostly similar parameters, the same batch size, and a closer final metric score: 60.0685 (tf) vs. 60.6626 (torch).

Code:

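import yaml
from ludwig.automl import auto_train

# load_mercedes_benz_greener() is a dataset-loading helper that is not shown in this comment.

# Load the overrides that force both versions to use the same input data types.
with open('user_config.yaml', 'rb') as f:
    user_config = yaml.safe_load(f)

mercedes_benz_greener_df = load_mercedes_benz_greener()

# Run AutoML for 1 hour (3600 s) on target column 'y' with the overrides applied.
auto_train_results = auto_train(
    dataset=mercedes_benz_greener_df,
    target='y',
    time_limit_s=3600,
    tune_for_memory=False,
    user_config=user_config
)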
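
The contents of user_config.yaml are not shown in the comment. As a hedged illustration of what forcing the input data types could look like, the sketch below builds a config fragment that pins each column to an explicit type. The input_features / name / type layout follows Ludwig's general config style, but the helper name pin_input_types, the column names, and the chosen types are hypothetical and are not the values used in this run.

```python
def pin_input_types(column_types: dict) -> dict:
    """Build a config fragment that fixes the feature type of each input column."""
    return {
        "input_features": [
            {"name": name, "type": feature_type}
            for name, feature_type in column_types.items()
        ]
    }


# Hypothetical columns and types, for illustration only.
example_types = {"X0": "category", "X10": "binary"}

# The same dict could equally be written as YAML and read back with yaml.safe_load.
user_config = pin_input_types(example_types)
print(user_config)
```

Passing the same fragment to both the PyTorch and TF runs removes automatic type inference as a source of difference between them.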

PyTorch results:

+----------------+------------+-------------------+------------------------+--------------------------+----------------------+------------------------+------------------------------+-----------------+---------------------+----------------------+----------------------+-----------------------+-------------------------+--------+------------------+----------------+
| Trial name     | status     | loc               |   combiner.bn_momentum |   combiner.bn_virtual_bs |   combiner.num_steps |   combiner.output_size |   combiner.relaxation_factor |   combiner.size |   combiner.sparsity |   trainer.batch_size |   trainer.decay_rate |   trainer.decay_steps |   trainer.learning_rate |   iter |   total time (s) |   metric_score |
|----------------+------------+-------------------+------------------------+--------------------------+----------------------+------------------------+------------------------------+-----------------+---------------------+----------------------+----------------------+-----------------------+-------------------------+--------+------------------+----------------|
| trial_dd97affc | TERMINATED | 172.31.9.92:436   |                   0.7  |                     2048 |                    4 |                      8 |                          1   |              32 |              0.0001 |                 8192 |                 0.95 |                   500 |                   0.025 |   1740 |        3596      |        64.4081 |
| trial_de0e35c8 | TERMINATED | 172.31.14.97:1457 |                   0.8  |                      256 |                    9 |                    128 |                          1   |              32 |              0      |                 2048 |                 0.95 |                 10000 |                   0.005 |     23 |          75.6883 |       341.896  |
| trial_de201716 | TERMINATED | 172.31.2.131:447  |                   0.98 |                     1024 |                    3 |                      8 |                          1   |               8 |              1e-06  |                 1024 |                 0.8  |                   500 |                   0.005 |     41 |          73.0759 |        91.5639 |
| trial_de909d10 | TERMINATED | 172.31.2.131:536  |                   0.8  |                     1024 |                    7 |                     24 |                          2   |               8 |              0.001  |                  256 |                 0.95 |                 20000 |                   0.01  |     23 |          74.1498 |       131.794  |
| trial_0c1caeb8 | TERMINATED | 172.31.14.97:2986 |                   0.9  |                      256 |                    5 |                     64 |                          1.5 |              64 |              0.0001 |                  512 |                 0.95 |                 20000 |                   0.005 |   1542 |        3514.13   |        63.3714 |
| trial_0ca851e8 | TERMINATED | 172.31.2.131:605  |                   0.6  |                      256 |                    3 |                     16 |                          1.2 |              16 |              0.01   |                 1024 |                 0.9  |                  8000 |                   0.005 |    215 |         361.174  |        69.0541 |
| trial_3ac218de | TERMINATED | 172.31.2.131:656  |                   0.6  |                      512 |                    5 |                     32 |                          2   |               8 |              0      |                 2048 |                 0.95 |                 10000 |                   0.025 |     39 |          72.9415 |       109.335  |
| trial_146bfa64 | TERMINATED | 172.31.2.131:707  |                   0.9  |                     4096 |                    7 |                     24 |                          1.5 |              64 |              0      |                  256 |                 0.9  |                 10000 |                   0.01  |     18 |          74.5411 |       121.671  |
| trial_424b2c16 | TERMINATED | 172.31.2.131:758  |                   0.8  |                      256 |                    3 |                     16 |                          1.2 |              64 |              0.001  |                 4096 |                 0.95 |                  2000 |                   0.005 |     32 |          72.1142 |      3221.51   |
| trial_71a724f6 | TERMINATED | 172.31.2.131:809  |                   0.8  |                      512 |                    9 |                     16 |                          1   |              24 |              0.001  |                  256 |                 0.9  |                  8000 |                   0.02  |    793 |        2838.42   |        60.6626 |
+----------------+------------+-------------------+------------------------+--------------------------+----------------------+------------------------+------------------------------+-----------------+---------------------+----------------------+----------------------+-----------------------+-------------------------+--------+------------------+----------------+

tf-legacy results:

+----------------+------------+-------------------+------------------------+--------------------------+----------------------+------------------------+------------------------------+-----------------+---------------------+-----------------------+-----------------------+------------------------+--------------------------+--------+------------------+----------------+
| Trial name     | status     | loc               |   combiner.bn_momentum |   combiner.bn_virtual_bs |   combiner.num_steps |   combiner.output_size |   combiner.relaxation_factor |   combiner.size |   combiner.sparsity |   training.batch_size |   training.decay_rate |   training.decay_steps |   training.learning_rate |   iter |   total time (s) |   metric_score |
|----------------+------------+-------------------+------------------------+--------------------------+----------------------+------------------------+------------------------------+-----------------+---------------------+-----------------------+-----------------------+------------------------+--------------------------+--------+------------------+----------------|
| trial_90f5e990 | TERMINATED | 172.31.1.26:1327  |                   0.7  |                     2048 |                    4 |                      8 |                          1   |              32 |              0.0001 |                  8192 |                  0.95 |                    500 |                    0.025 |   2609 |        3596.22   |        61.6253 |
| trial_9178f98e | TERMINATED | 172.31.15.14:408  |                   0.8  |                      256 |                    9 |                    128 |                          1   |              32 |              0      |                  2048 |                  0.95 |                  10000 |                    0.005 |      1 |         106.677  |     10221.2    |
| trial_ea90e446 | TERMINATED | 172.31.7.225:412  |                   0.98 |                     1024 |                    3 |                      8 |                          1   |               8 |              1e-06  |                  1024 |                  0.8  |                    500 |                    0.005 |     20 |          72.9318 |      8199.69   |
| trial_f5313b62 | TERMINATED | 172.31.7.225:633  |                   0.8  |                     1024 |                    7 |                     24 |                          2   |               8 |              0.001  |                   256 |                  0.95 |                  20000 |                    0.01  |     95 |         362.706  |        63.6851 |
| trial_24aa17a6 | TERMINATED | 172.31.15.14:558  |                   0.9  |                      256 |                    5 |                     64 |                          1.5 |              64 |              0.0001 |                   512 |                  0.95 |                  20000 |                    0.005 |      8 |          74.0691 |       453.44   |
| trial_30bd0454 | TERMINATED | 172.31.15.14:698  |                   0.6  |                      256 |                    3 |                     16 |                          1.2 |              16 |              0.01   |                  1024 |                  0.9  |                   8000 |                    0.005 |     15 |          73.7284 |      2247.54   |
| trial_61b1ddf0 | TERMINATED | 172.31.15.14:852  |                   0.6  |                      512 |                    5 |                     32 |                          2   |               8 |              0      |                  2048 |                  0.95 |                  10000 |                    0.025 |     11 |          74.2399 |       291.301  |
| trial_91d25ed8 | TERMINATED | 172.31.15.14:986  |                   0.9  |                     4096 |                    7 |                     24 |                          1.5 |              64 |              0      |                   256 |                  0.9  |                  10000 |                    0.01  |    783 |        3082.17   |        60.0685 |
| trial_c25540fc | TERMINATED | 172.31.7.225:1227 |                   0.8  |                      256 |                    3 |                     16 |                          1.2 |              64 |              0.001  |                  4096 |                  0.95 |                   2000 |                    0.005 |      8 |          72.9656 |      9512.51   |
| trial_015f5634 | TERMINATED | 172.31.7.225:1346 |                   0.8  |                      512 |                    9 |                     16 |                          1   |              24 |              0.001  |                   256 |                  0.9  |                   8000 |                    0.02  |      4 |          73.7548 |       247.073  |
+----------------+------------+-------------------+------------------------+--------------------------+----------------------+------------------------+------------------------------+-----------------+---------------------+-----------------------+-----------------------+------------------------+--------------------------+--------+------------------+----------------+
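
To make the comparison between the two tables concrete, the sketch below picks the best trial from each and computes the remaining relative gap. The metric_score values are copied from the tables above. Treating the lowest metric_score as best is an assumption, though it is consistent with the two scores quoted in the comment. The metric_score column is on a different scale from the RMSE figures in the ticket, so this gap is not directly comparable to the 4.6%.

```python
# metric_score per trial, copied from the two results tables.
pytorch_scores = {
    "trial_dd97affc": 64.4081, "trial_de0e35c8": 341.896,
    "trial_de201716": 91.5639, "trial_de909d10": 131.794,
    "trial_0c1caeb8": 63.3714, "trial_0ca851e8": 69.0541,
    "trial_3ac218de": 109.335, "trial_146bfa64": 121.671,
    "trial_424b2c16": 3221.51, "trial_71a724f6": 60.6626,
}
tf_scores = {
    "trial_90f5e990": 61.6253, "trial_9178f98e": 10221.2,
    "trial_ea90e446": 8199.69, "trial_f5313b62": 63.6851,
    "trial_24aa17a6": 453.44, "trial_30bd0454": 2247.54,
    "trial_61b1ddf0": 291.301, "trial_91d25ed8": 60.0685,
    "trial_c25540fc": 9512.51, "trial_015f5634": 247.073,
}


def best_trial(scores: dict) -> tuple:
    """Return (trial_name, score) for the lowest score (assumed to be best)."""
    return min(scores.items(), key=lambda item: item[1])


best_pytorch = best_trial(pytorch_scores)
best_tf = best_trial(tf_scores)

# Relative gap of the best PyTorch score over the best TF score.
gap = (best_pytorch[1] - best_tf[1]) / best_tf[1]

print("best pytorch:", best_pytorch)
print("best tf:     ", best_tf)
print(f"relative gap: {gap:.2%}")
```

With the input types pinned, the best scores differ by about 1%.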