Nightly Ludwig seems to use the Ray AIR API incorrectly

See original GitHub issue

Describe the bug

hyperopt throws an error that appears to be caused by incorrect usage of the Ray AIR API.

To Reproduce

Steps to reproduce the behavior:

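    from ludwig.automl import create_auto_config, train_with_config
    from ludwig.datasets import mushroom_edibility

    # target_column, time_limit_s, tune_for_memory, default_random_seed and
    # output_dir are defined elsewhere in the calling script (auto_train.py).

    # TODO: write our own data loader
    mushroom_edibility_df = mushroom_edibility.load()

    # Build the AutoML config for the dataset.
    automl_config = create_auto_config(
        dataset=mushroom_edibility_df,
        target=target_column,
        time_limit_s=time_limit_s,
        tune_for_memory=tune_for_memory,
        user_config=None,
        random_seed=default_random_seed,
        use_reference_config=False,
    )

    # Run hyperopt with that config; this is the call that fails.
    auto_train_results = train_with_config(
        dataset=mushroom_edibility_df,
        config=automl_config,
        output_directory=output_dir,
        random_seed=default_random_seed,
    )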

Logs


2022-09-13 21:02:26,099	INFO timeout.py:54 -- Reached timeout of 180 seconds. Stopping all trials.
== Status ==
Current time: 2022-09-13 21:02:26 (running for 00:03:00.88)
Memory usage on this node: 3.6/7.7 GiB
Using AsyncHyperBand: num_stopped=2
Bracket: Iter 72.000: 1.0
Resources requested: 0/6 CPUs, 0/0 GPUs, 0.0/4.36 GiB heap, 0.0/2.18 GiB objects
Current best trial: f5a3c492 with metric_score=1.0 and parameters={'trainer.learning_rate': 0.005, 'trainer.decay_rate': 0.8, 'trainer.decay_steps': 500, 'combiner.size': 16, 'combiner.output_size': 16, 'combiner.num_steps': 4, 'combiner.relaxation_factor': 1.0, 'combiner.sparsity': 0.0001, 'combiner.bn_virtual_bs': 256, 'combiner.bn_momentum': 0.2}
Result logdir: /tmp/automl/artifact/hyperopt
Number of trials: 9/10 (9 TERMINATED)
+----------------+------------+----------------+------------------------+------------------------+----------------------+------------------------+------------------------+-----------------+---------------------+----------------------+-----------------------+------------------------+--------+------------------+----------------+
| Trial name     | status     | loc            |   combiner.bn_momentum |   combiner.bn_virtu... |   combiner.num_steps |   combiner.output_size |   combiner.relaxati... |   combiner.size |   combiner.sparsity |   trainer.decay_rate |   trainer.decay_steps |   trainer.learning_... |   iter |   total time (s) |   metric_score |
|----------------+------------+----------------+------------------------+------------------------+----------------------+------------------------+------------------------+-----------------+---------------------+----------------------+-----------------------+------------------------+--------+------------------+----------------|
| trial_f2fdbd9c | TERMINATED | 172.17.0.2:362 |                   0.05 |                    512 |                    3 |                      8 |                    2   |              16 |              0.1    |                 0.95 |                   500 |                  0.025 |     54 |        173.326   |       0.997537 |
| trial_f59ff54c | TERMINATED | 172.17.0.2:396 |                   0.4  |                   2048 |                    9 |                      8 |                    1.5 |               8 |              0.001  |                 0.95 |                 10000 |                  0.005 |     13 |         76.7784  |       0.985222 |
| trial_f5a3c492 | TERMINATED | 172.17.0.2:398 |                   0.2  |                    256 |                    4 |                     16 |                    1   |              16 |              0.0001 |                 0.8  |                   500 |                  0.005 |     44 |        167.486   |       1        |
| trial_f5a82dd4 | TERMINATED | 172.17.0.2:400 |                   0.2  |                   2048 |                    6 |                    128 |                    1   |              24 |              0      |                 0.95 |                 20000 |                  0.01  |     24 |        159.313   |       1        |
| trial_f5acaa94 | TERMINATED | 172.17.0.2:404 |                   0.4  |                    512 |                    9 |                     32 |                    2   |              24 |              1e-06  |                 0.95 |                 20000 |                  0.005 |     23 |        162.771   |       1        |
| trial_f5b17f6a | TERMINATED | 172.17.0.2:406 |                   0.3  |                    256 |                    4 |                     24 |                    1.2 |              32 |              0.0001 |                 0.9  |                  8000 |                  0.005 |     45 |        167.616   |       1        |
| trial_f5b87b3a | TERMINATED | 172.17.0.2:725 |                   0.3  |                    512 |                    7 |                    128 |                    1   |               8 |              0.001  |                 0.95 |                 10000 |                  0.025 |      8 |         72.1376  |       0.997537 |
| trial_285573b8 | TERMINATED | 172.17.0.2:892 |                   0.02 |                   1024 |                    6 |                     32 |                    2   |               8 |              0      |                 0.9  |                 10000 |                  0.01  |      1 |          5.96841 |       0.561576 |
| trial_582786d0 | TERMINATED |                |                   0.4  |                    256 |                    4 |                     16 |                    2   |              24 |              0.01   |                 0.95 |                  2000 |                  0.005 |        |                  |                |
+----------------+------------+----------------+------------------------+------------------------+----------------------+------------------------+------------------------+-----------------+---------------------+----------------------+-----------------------+------------------------+--------+------------------+----------------+


2022-09-13 21:04:26,204	INFO tune.py:758 -- Total run time: 300.97 seconds (180.84 seconds for the tuning loop).
Traceback (most recent call last):
  File "auto_train.py", line 182, in <module>
    run_job(args.dataset, args.target, args.time_limit_s, args.tune_for_memory ,args.output_directory)
  File "auto_train.py", line 48, in run_job
    auto_train_results = train_with_config(
  File "/usr/local/lib/python3.8/site-packages/ludwig/automl/automl.py", line 209, in train_with_config
    hyperopt_results = _train(
  File "/usr/local/lib/python3.8/site-packages/ludwig/automl/automl.py", line 308, in _train
    hyperopt_results = hyperopt(
  File "/usr/local/lib/python3.8/site-packages/ludwig/hyperopt/run.py", line 329, in hyperopt
    hyperopt_results = hyperopt_executor.execute(
  File "/usr/local/lib/python3.8/site-packages/ludwig/hyperopt/execution.py", line 828, in execute
    self._evaluate_best_model(
  File "/usr/local/lib/python3.8/site-packages/ludwig/hyperopt/execution.py", line 385, in _evaluate_best_model
    os.path.join(best_model_path, "model"),
  File "/usr/local/lib/python3.8/posixpath.py", line 76, in join
    a = os.fspath(a)
  File "/usr/local/lib/python3.8/site-packages/ray/air/checkpoint.py", line 616, in __fspath__
    raise TypeError(
TypeError: You cannot use `air.Checkpoint` objects directly as paths. Use `Checkpoint.to_directory()` or `Checkpoint.as_directory()` instead.
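
The traceback above shows `os.path.join` being called on an `air.Checkpoint`, whose `__fspath__` refuses to act as a path. The following is a minimal sketch of the pattern the error message recommends, not the actual Ludwig fix. `StandInCheckpoint` and `model_dir` are hypothetical names introduced here for illustration; the stand-in only mimics the two behaviors visible in the log (a raising `__fspath__` and an `as_directory()` context manager), so the snippet runs without Ray installed.

```python
import os
import tempfile
from contextlib import contextmanager


class StandInCheckpoint:
    """Hypothetical stand-in for ray.air.Checkpoint (assumption, not the real class)."""

    def __init__(self, directory):
        self._directory = directory

    def __fspath__(self):
        # Mirrors the behavior in the traceback: path-like use is rejected.
        raise TypeError("You cannot use `air.Checkpoint` objects directly as paths.")

    @contextmanager
    def as_directory(self):
        # The real method may materialize a temporary directory and clean it up
        # on exit; this stand-in simply yields the directory it was given.
        yield self._directory


@contextmanager
def model_dir(best_model_path):
    """Yield '<checkpoint dir>/model' for a checkpoint-like object or a plain path.

    The path is only guaranteed to be valid inside the `with` block.
    """
    if hasattr(best_model_path, "as_directory"):
        with best_model_path.as_directory() as ckpt_dir:
            yield os.path.join(ckpt_dir, "model")
    else:
        yield os.path.join(best_model_path, "model")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        checkpoint = StandInCheckpoint(tmp)
        with model_dir(checkpoint) as path:
            print(path)
```

Keeping the work inside the `with` block matters if the checkpoint directory is temporary: a path returned after the context exits could point at a directory that no longer exists.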

Expected behavior

The job should finish successfully.

Environment (please complete the following information):

  • OS: Linux
  • Version: same as the Ludwig container image
  • Python version: 3.8
  • Ludwig version: ludwig:release-0.6, ludwig:nightly

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 5

Top GitHub Comments

tgaddair commented, Sep 14, 2022 (1 reaction)

@Jeffwan #2485 should address the issue. Let me know if you still run into any issues!

tgaddair commented, Sep 14, 2022 (0 reactions)

Thanks for verifying @Jeffwan! I’ve now merged this into the 0.6 release branch in #2491.
