BUG: Deadlock with ray.tune since modin version 0.16.0
Modin version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest released version of Modin.
- I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)
Reproducible Example
# %%
import modin.pandas as pd
import numpy as np
import ray
from ray import tune

if not ray.is_initialized():
    try:
        from ray import air
        ray.init(num_cpus=4, runtime_env={"env_vars": {"__MODIN_AUTOIMPORT_PANDAS__": "1"}})
    except (TypeError, ImportError):
        ray.init(num_cpus=4)

# Create a df with a numpy array in each cell, e.g. usage for timeseries
################################################################################
num_timeseries = 100
rand_int = np.random.randint(0, 10, size=(num_timeseries))
rand_float = np.random.random_sample((num_timeseries))
num_rows = 1000
df = pd.DataFrame({"abc": [rand_int] * num_rows, "def": [rand_float] * num_rows})
df
# %%
# Run df operations in a hyperparameter tuning experiment.
# It only fails/gets stuck for me because of the series_diff and series_sum operations.
# Apply operations, however many there are, are not a problem.
################################################################################
def easy_objective(config, data):
    df = data[0]
    column = "abc"
    for column in df.columns:
        series_min = df[column].apply(np.nanmin)
        series_max = df[column].apply(np.nanmax)
        series_diff = series_max - series_min
        series_sum = series_max + series_min

# Using the old API, as it can be used in Ray versions 1.* and 2.*
tune.run(
    tune.with_parameters(easy_objective, data=[df]),
    num_samples=10,
    resources_per_trial=tune.PlacementGroupFactory([{
        "CPU": 1,
        "GPU": 0
    }, {
        "CPU": 1
    }], strategy="PACK"),
)
Issue Description
When running this very simple Python snippet, which stands in for model training on time series via ray.tune, some trials remain in RUNNING status and never finish. Sometimes only a single trial is affected, sometimes several. The issue first appeared with modin==0.16.0 and pandas==1.5.0. CPU load is high at the beginning but drops off very quickly. Running the experiment several times and interrupting it each time it fails to finish increases the amount of memory that is held but unused.
Possibly related issue:
Expected Behavior
The run should finish within 1 min (for me it finished within 15 sec). ray.tune runs shouldn't get stuck in state RUNNING.
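One way to turn this expectation into a repeatable check is to run the script under a wall-clock limit and report a hang instead of waiting for Ctrl+C. The sketch below uses only the standard library; `finishes_within` is a hypothetical helper, not part of Ray or Modin, and the two commands are stand-ins for the real script.

```python
import subprocess
import sys


def finishes_within(cmd, timeout_s):
    """Return True if cmd exits within timeout_s seconds, else False.

    subprocess.run kills the child process when the timeout expires,
    so the launched interpreter does not linger after the check.
    """
    try:
        subprocess.run(cmd, timeout=timeout_s, check=False)
        return True
    except subprocess.TimeoutExpired:
        return False


# Stand-ins for the real script: one command that returns at once and
# one that sleeps far longer than the limit, like a stuck trial.
quick = [sys.executable, "-c", "pass"]
stuck = [sys.executable, "-c", "import time; time.sleep(30)"]

quick_ok = finishes_within(quick, timeout_s=20)
stuck_ok = finishes_within(stuck, timeout_s=1)
print(quick_ok, stuck_ok)
```

For the report itself, `cmd` would be something like `[sys.executable, "bug_report_example.py"]` with `timeout_s=60`. Only the launched interpreter is killed on timeout; worker processes started by Ray are separate processes and may need to be cleaned up on their own.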
Error Logs
2022-11-21 14:27:01,295 INFO services.py:1456 -- View the Ray dashboard at http://127.0.0.1:8283
UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
UserWarning: Distributing <class 'dict'> object. This may take some time.
2022-11-21 14:27:02,315 WARNING function_runner.py:598 -- Function checkpointing is disabled. This may result in unexpected behavior when using checkpointing features or certain schedulers. To enable, set the train function arguments to be `func(config, checkpoint_dir=None)`.
2022-11-21 14:27:02,359 WARNING tune.py:636 -- Tune detects GPUs, but no trials are using GPUs. To enable trials to use GPUs, set tune.run(resources_per_trial={'gpu': 1}...) which allows Tune to expose 1 GPU to each trial. You can also override `Trainable.default_resource_request` if using the Trainable API.
2022-11-21 14:27:02,488 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00000
== Status ==
Current time: 2022-11-21 14:27:03 (running for 00:00:01.18)
Memory usage on this node: 25.2/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (9 PENDING, 1 RUNNING)
+----------------------------+----------+-----------------------+
| Trial name | status | loc |
|----------------------------+----------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | PENDING | |
| easy_objective_2f0a2_00002 | PENDING | |
| easy_objective_2f0a2_00003 | PENDING | |
| easy_objective_2f0a2_00004 | PENDING | |
| easy_objective_2f0a2_00005 | PENDING | |
| easy_objective_2f0a2_00006 | PENDING | |
| easy_objective_2f0a2_00007 | PENDING | |
| easy_objective_2f0a2_00008 | PENDING | |
| easy_objective_2f0a2_00009 | PENDING | |
+----------------------------+----------+-----------------------+
2022-11-21 14:27:03,546 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00001
(easy_objective pid=460969) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
(easy_objective pid=461108) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
Trial easy_objective_2f0a2_00001 completed. Last result:
2022-11-21 14:27:06,996 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00002
(easy_objective pid=461403) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
Trial easy_objective_2f0a2_00002 completed. Last result:
== Status ==
Current time: 2022-11-21 14:27:09 (running for 00:00:06.70)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (7 PENDING, 1 RUNNING, 2 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00003 | PENDING | |
| easy_objective_2f0a2_00004 | PENDING | |
| easy_objective_2f0a2_00005 | PENDING | |
| easy_objective_2f0a2_00006 | PENDING | |
| easy_objective_2f0a2_00007 | PENDING | |
| easy_objective_2f0a2_00008 | PENDING | |
| easy_objective_2f0a2_00009 | PENDING | |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
+----------------------------+------------+-----------------------+
2022-11-21 14:27:10,016 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00003
(easy_objective pid=461747) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
Trial easy_objective_2f0a2_00003 completed. Last result:
2022-11-21 14:27:13,033 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00004
(easy_objective pid=462058) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
Trial easy_objective_2f0a2_00004 completed. Last result:
== Status ==
Current time: 2022-11-21 14:27:15 (running for 00:00:12.72)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (5 PENDING, 1 RUNNING, 4 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00005 | PENDING | |
| easy_objective_2f0a2_00006 | PENDING | |
| easy_objective_2f0a2_00007 | PENDING | |
| easy_objective_2f0a2_00008 | PENDING | |
| easy_objective_2f0a2_00009 | PENDING | |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
+----------------------------+------------+-----------------------+
2022-11-21 14:27:16,010 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00005
(easy_objective pid=462263) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
Trial easy_objective_2f0a2_00005 completed. Last result:
2022-11-21 14:27:19,009 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00006
(easy_objective pid=462614) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
Trial easy_objective_2f0a2_00006 completed. Last result:
== Status ==
Current time: 2022-11-21 14:27:21 (running for 00:00:18.70)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (3 PENDING, 1 RUNNING, 6 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00007 | PENDING | |
| easy_objective_2f0a2_00008 | PENDING | |
| easy_objective_2f0a2_00009 | PENDING | |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
+----------------------------+------------+-----------------------+
2022-11-21 14:27:22,006 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00007
(easy_objective pid=462811) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
Trial easy_objective_2f0a2_00007 completed. Last result:
2022-11-21 14:27:25,009 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00008
(easy_objective pid=463103) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
Trial easy_objective_2f0a2_00008 completed. Last result:
== Status ==
Current time: 2022-11-21 14:27:27 (running for 00:00:24.72)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 PENDING, 1 RUNNING, 8 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00009 | PENDING | |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
+----------------------------+------------+-----------------------+
2022-11-21 14:27:28,025 INFO trial_runner.py:803 -- starting easy_objective_2f0a2_00009
(easy_objective pid=463411) UserWarning: When using a pre-initialized Ray cluster, please ensure that the runtime env sets environment variable __MODIN_AUTOIMPORT_PANDAS__ to 1
Trial easy_objective_2f0a2_00009 completed. Last result:
== Status ==
Current time: 2022-11-21 14:27:35 (running for 00:00:32.71)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:27:40 (running for 00:00:37.71)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:27:45 (running for 00:00:42.72)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:27:50 (running for 00:00:47.72)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:27:55 (running for 00:00:52.72)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:28:00 (running for 00:00:57.72)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:28:05 (running for 00:01:02.73)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:28:10 (running for 00:01:07.73)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:28:15 (running for 00:01:12.73)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:28:20 (running for 00:01:17.73)
Memory usage on this node: 25.4/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:28:25 (running for 00:01:22.74)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:28:30 (running for 00:01:27.74)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
^C2022-11-21 14:28:32,098 WARNING tune.py:650 -- SIGINT received (e.g. via Ctrl+C), ending Ray Tune run. This will try to checkpoint the experiment state one last time. Press CTRL+C one more time (or send SIGINT/SIGKILL/SIGTERM) to skip.
== Status ==
Current time: 2022-11-21 14:28:35 (running for 00:01:32.74)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
== Status ==
Current time: 2022-11-21 14:28:35 (running for 00:01:32.74)
Memory usage on this node: 25.3/62.6 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/4 CPUs, 0/1 GPUs, 0.0/23.86 GiB heap, 0.0/11.93 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/andreas/ray_results/easy_objective_2022-11-21_14-27-02
Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)
+----------------------------+------------+-----------------------+
| Trial name | status | loc |
|----------------------------+------------+-----------------------|
| easy_objective_2f0a2_00000 | RUNNING | 192.168.178.41:460969 |
| easy_objective_2f0a2_00001 | TERMINATED | 192.168.178.41:461108 |
| easy_objective_2f0a2_00002 | TERMINATED | 192.168.178.41:461403 |
| easy_objective_2f0a2_00003 | TERMINATED | 192.168.178.41:461747 |
| easy_objective_2f0a2_00004 | TERMINATED | 192.168.178.41:462058 |
| easy_objective_2f0a2_00005 | TERMINATED | 192.168.178.41:462263 |
| easy_objective_2f0a2_00006 | TERMINATED | 192.168.178.41:462614 |
| easy_objective_2f0a2_00007 | TERMINATED | 192.168.178.41:462811 |
| easy_objective_2f0a2_00008 | TERMINATED | 192.168.178.41:463103 |
| easy_objective_2f0a2_00009 | TERMINATED | 192.168.178.41:463411 |
+----------------------------+------------+-----------------------+
^CTraceback (most recent call last):
File "bug_report_example.py", line 41, in <module>
tune.run(
File "/home/andreas/miniconda3/envs/elise/lib/python3.8/site-packages/ray/tune/tune.py", line 686, in run
runner.cleanup()
File "/home/andreas/miniconda3/envs/elise/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 1382, in cleanup
self.cleanup_trials()
File "/home/andreas/miniconda3/envs/elise/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 1378, in cleanup_trials
self.trial_executor.cleanup(self.get_trials())
File "/home/andreas/miniconda3/envs/elise/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 747, in cleanup
ready, _ = ray.wait(list(self._futures.keys()), timeout=0)
File "/home/andreas/miniconda3/envs/elise/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/home/andreas/miniconda3/envs/elise/lib/python3.8/site-packages/ray/worker.py", line 1993, in wait
ready_ids, remaining_ids = worker.core_worker.wait(
File "python/ray/_raylet.pyx", line 1401, in ray._raylet.CoreWorker.wait
File "python/ray/_raylet.pyx", line 167, in ray._raylet.check_status
KeyboardInterrupt
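The long run of identical status blocks in the log above boils down to one fact: the summary line stops changing while one trial still reports RUNNING. A minimal sketch of how that could be detected automatically from captured output follows; `trial_counts` and `looks_stuck` are hypothetical helpers, and the regular expression assumes the summary-line format shown in this log.

```python
import re

# Matches summary lines such as
# "Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)".
SUMMARY_RE = re.compile(r"Number of trials: \d+/\d+ \(([^)]*)\)")


def trial_counts(line):
    """Return {status: count} for one Tune summary line, or None."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    counts = {}
    for part in match.group(1).split(","):
        number, status = part.split()
        counts[status] = int(number)
    return counts


def looks_stuck(lines, min_repeats=3):
    """True if the last min_repeats summaries are identical and still
    report RUNNING trials with nothing left PENDING."""
    summaries = [c for c in map(trial_counts, lines) if c is not None]
    tail = summaries[-min_repeats:]
    if len(tail) < min_repeats:
        return False
    same = all(c == tail[0] for c in tail)
    return same and tail[0].get("RUNNING", 0) > 0 and "PENDING" not in tail[0]


log = [
    "Number of trials: 10/10 (1 PENDING, 1 RUNNING, 8 TERMINATED)",
    "Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)",
    "Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)",
    "Number of trials: 10/10 (1 RUNNING, 9 TERMINATED)",
]
print(trial_counts(log[0]))
print(looks_stuck(log))
```

The `min_repeats` threshold is arbitrary; a long-running but healthy trial would also trigger it, so the check is only a hint that a run deserves a closer look.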
Installed Versions
INSTALLED VERSIONS
commit : e50cec12655b4a2a3decf342ab45433080d3c023
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-10052-tuxedo
Version : #58~20.04.1tux1 SMP Mon Oct 24 12:05:31 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
Modin dependencies
modin : 0.17.0
ray : 1.12.1
dask : None
distributed : None
hdk : None
pandas dependencies
pandas : 1.5.1
numpy : 1.23.4
pytz : 2022.6
dateutil : 2.8.2
setuptools : 65.5.0
pip : 21.1.3
Cython : None
pytest : 7.2.0
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.6.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : None
fsspec : 2022.11.0
gcsfs : None
matplotlib : 3.5.3
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.3
snappy : None
sqlalchemy : 1.3.24
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None
I don’t think there’s any good way to run arbitrary ray tasks that operate on Modin dataframe objects because ray can’t infer all the dependencies as I pointed out here.
cc @modin-project/modin-contributors @modin-project/modin-core @modin-project/modin-ray in case anyone thinks otherwise. If there’s no response in 48 hours, I’ll close the issue.
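Given the point above that Ray cannot infer the dependencies of a Modin dataframe used inside an arbitrary task, one possible workaround (an assumption on my part, not something proposed in this thread) is to hand the trials plain in-memory data so that no Modin partitions have to be resolved inside a Ray task. The sketch below does the same max/min arithmetic as `easy_objective` with NumPy only; `objective_on_arrays` is a hypothetical name, and whether this avoids the hang on the affected versions is untested here.

```python
import numpy as np


def objective_on_arrays(columns):
    """Compute per-row (max - min) and (max + min) for each column.

    columns maps a column name to a 2-D array with one time series
    per row, i.e. the cells of the original DataFrame stacked up.
    """
    results = {}
    for name, values in columns.items():
        row_min = np.nanmin(values, axis=1)
        row_max = np.nanmax(values, axis=1)
        results[name] = {
            "diff": row_max - row_min,
            "sum": row_max + row_min,
        }
    return results


# Same shape as the data in the report: 1000 rows, 100 samples each.
rng = np.random.default_rng(0)
rand_int = rng.integers(0, 10, size=100)
rand_float = rng.random(100)
columns = {
    "abc": np.tile(rand_int, (1000, 1)),
    "def": np.tile(rand_float, (1000, 1)),
}

out = objective_on_arrays(columns)
print(out["abc"]["diff"].shape)
```

A dictionary of arrays like `columns` could then be passed through `tune.with_parameters` in place of the Modin dataframe. The trade-off is that the work inside each trial is no longer distributed by Modin.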
Hi @ahallermed! Thank you for opening this issue - I’m tagging @mvashishtha since he has some experience with ray deadlocking (as the author of the previous issue you tagged)!