[Core][Bug] decision_tree_autoscaling_20_runs is flaky

Search before asking

  • I searched the issues and found no similar issues.

Ray Component

Ray Core

What happened + What you expected to happen

Looks like a worker crashed during startup?

2021-12-28 04:16:51,152 INFO worker.py:853 -- Connecting to existing Ray cluster at address: 172.31.29.225:6379
(best_split_for_idx_remote pid=288, ip=172.31.45.66) 2021-12-28 04:20:36,449    ERROR worker.py:432 -- SystemExit was raised from the worker.
(best_split_for_idx_remote pid=288, ip=172.31.45.66) Traceback (most recent call last):
(best_split_for_idx_remote pid=288, ip=172.31.45.66)   File "python/ray/_raylet.pyx", line 770, in ray._raylet.task_execution_handler
(best_split_for_idx_remote pid=288, ip=172.31.45.66)   File "python/ray/_raylet.pyx", line 591, in ray._raylet.execute_task
(best_split_for_idx_remote pid=288, ip=172.31.45.66)   File "python/ray/_raylet.pyx", line 629, in ray._raylet.execute_task
(best_split_for_idx_remote pid=288, ip=172.31.45.66)   File "python/ray/_raylet.pyx", line 636, in ray._raylet.execute_task
(best_split_for_idx_remote pid=288, ip=172.31.45.66)   File "python/ray/_raylet.pyx", line 640, in ray._raylet.execute_task
(best_split_for_idx_remote pid=288, ip=172.31.45.66)   File "decision_tree/cart_with_tree.py", line 305, in best_split_for_idx_remote
(best_split_for_idx_remote pid=288, ip=172.31.45.66)   File "decision_tree/cart_with_tree.py", line 266, in best_split_for_idx
(best_split_for_idx_remote pid=288, ip=172.31.45.66)   File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/worker.py", line 429, in sigterm_handler
(best_split_for_idx_remote pid=288, ip=172.31.45.66)     sys.exit(1)
(best_split_for_idx_remote pid=288, ip=172.31.45.66) SystemExit: 1
(best_split_for_idx_remote pid=2551, ip=172.31.42.15) 2021-12-28 04:20:36,444   ERROR worker.py:432 -- SystemExit was raised from the worker.
(best_split_for_idx_remote pid=2551, ip=172.31.42.15) Traceback (most recent call last):
(best_split_for_idx_remote pid=2551, ip=172.31.42.15)   File "python/ray/_raylet.pyx", line 770, in ray._raylet.task_execution_handler
(best_split_for_idx_remote pid=2551, ip=172.31.42.15)   File "python/ray/_raylet.pyx", line 591, in ray._raylet.execute_task
(best_split_for_idx_remote pid=2551, ip=172.31.42.15)   File "python/ray/_raylet.pyx", line 629, in ray._raylet.execute_task
(best_split_for_idx_remote pid=2551, ip=172.31.42.15)   File "python/ray/_raylet.pyx", line 636, in ray._raylet.execute_task
(best_split_for_idx_remote pid=2551, ip=172.31.42.15)   File "python/ray/_raylet.pyx", line 640, in ray._raylet.execute_task
(best_split_for_idx_remote pid=2551, ip=172.31.42.15)   File "decision_tree/cart_with_tree.py", line 305, in best_split_for_idx_remote
(best_split_for_idx_remote pid=2551, ip=172.31.42.15)   File "decision_tree/cart_with_tree.py", line 266, in best_split_for_idx
(best_split_for_idx_remote pid=2551, ip=172.31.42.15)   File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/worker.py", line 429, in sigterm_handler
(best_split_for_idx_remote pid=2551, ip=172.31.42.15)     sys.exit(1)
(best_split_for_idx_remote pid=2551, ip=172.31.42.15) SystemExit: 1
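
Both tracebacks end in `sigterm_handler` calling `sys.exit(1)`, which suggests the workers received SIGTERM while a task was running and did not fail on their own. The sketch below uses only the Python standard library to show how a handler of that shape surfaces as `SystemExit` inside whatever frame happened to be executing. It is not Ray's code: `busy_task` and `run_task` are hypothetical names, and the handler is a stand-in written from what the traceback shows. It assumes a POSIX system and must run in the main thread.

```python
import os
import signal
import sys
import traceback


def sigterm_handler(signum, frame):
    # Stand-in for the handler seen in the traceback: exit with status 1.
    sys.exit(1)


def busy_task():
    # Hypothetical task body. Send SIGTERM to this process to simulate
    # an external termination request arriving mid-task.
    os.kill(os.getpid(), signal.SIGTERM)
    # Keep executing bytecode so the interpreter has a chance to run
    # the Python-level handler if it has not done so already.
    for _ in range(1_000_000):
        pass
    return "finished"


def run_task():
    previous = signal.signal(signal.SIGTERM, sigterm_handler)
    try:
        return busy_task()
    except SystemExit as exc:
        # The traceback runs through the interrupted task frame and
        # ends in the signal handler, as in the worker log.
        frames = [f.name for f in traceback.extract_tb(exc.__traceback__)]
        return ("interrupted", exc.code, frames)
    finally:
        # Restore whatever handler was installed before.
        signal.signal(signal.SIGTERM, previous)


if __name__ == "__main__":
    print(run_task())
```

If this reading is right, the useful question for a reproduction is what sent SIGTERM to the workers (for example, a node being scaled down), not what went wrong inside `best_split_for_idx`.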

Seems

Versions / Dependencies

Latest Ray on product.

Reproduction script

N/A

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

rkooo567 commented, Jan 4, 2022 (1 reaction)

Maybe we should try reproducing it?

scv119 commented, Feb 17, 2022 (0 reactions)

There hasn’t been a reproduction since Jan 3rd. Probably not worth investigating at the moment.
