[Core] [Bug] Remote client environment is not setting up properly.
Search before asking
- I searched the issues and found no similar issues.
Ray Component
Ray Core, Ray Clusters
What happened + What you expected to happen
Hi everyone, I have a Ray cluster deployed on Azure K8s. I connect to it using the kubectl command given in the documentation.
I first ran the task in the local Ray environment to test that it works and scales.
Then, when I try to run put on an object to test it on the cluster, it throws the following error:
Put failed:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-9-bd8872eca93e> in <module>
16 'combined_dfs': combined_dfs_ray,
17 }
---> 18 run_validation = validate_source.remote(config)
ModuleNotFoundError: No module named 'sklearn'
But sklearn is available in the local env. Also, for the algorithm we are running, we pass the runtime_env option to ray.init to make our custom code available. All the dependencies are already installed in the env.
import ray

LOCAL_PORT = 10001
ray.init(
    f"ray://127.0.0.1:{LOCAL_PORT}",
    runtime_env={"working_dir": "../src"},  # ships the local custom code to the cluster
)
From the put command, I was expecting an objectId, indicating that the object had been successfully transferred to the cluster.
Versions / Dependencies
I am using:
- Conda (Windows)
- Python 3.8.12
- Ray[default] 1.9.1
- Remote client on Azure k8s
Reproduction script
I am trying to work out a small code sample, but simple code with no external dependencies works fine in the remote client environment. The error occurs only when I use our existing dev env. I am trying to create an example in the meantime. Also, if there are any logs that would help, please let me know.
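Until a full reproduction is available, one way to narrow this down is to ask a worker which modules it can actually import. The following is a minimal sketch, not part of the original report: the helper names missing_modules and check_on_cluster are hypothetical, and it assumes ray.init has already connected to the cluster and that only top-level module names (such as sklearn) are passed in.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that this process cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_on_cluster(names):
    """Run the same check inside a Ray task, i.e. in the worker's environment.

    Assumes ray.init(...) has already been called against the cluster.
    """
    import ray  # imported lazily so missing_modules works without Ray installed

    remote_check = ray.remote(missing_modules)
    return ray.get(remote_check.remote(names))
```

Calling missing_modules(["sklearn"]) locally and check_on_cluster(["sklearn"]) against the cluster should show whether the two environments differ. If the second call returns a non-empty list, the package is missing on the workers rather than on the client.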
Anything else
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Issue Analytics
- Created 2 years ago
- Comments:15 (6 by maintainers)
Top GitHub Comments
I only run it on my custom cluster, implemented via node_provider.py, and do not use Helm. This is my config:
When running on the cluster, you need to make sure the deps are also installed there. You don't need to do that when running locally, because everything runs in the same env (the local env).
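One possible way to get the deps onto the cluster, besides baking them into the worker image, is to request them through the "pip" field of runtime_env next to the existing "working_dir". This is a sketch under assumptions, not a confirmed fix for this issue: build_runtime_env and connect are hypothetical helper names, scikit-learn is assumed to be the package that provides sklearn here, and whether the workers in this particular Ray 1.9.1 deployment can install packages at runtime has not been verified.

```python
def build_runtime_env(working_dir, pip_packages):
    """Build a runtime_env dict that ships local code and requests pip packages on the cluster."""
    return {"working_dir": working_dir, "pip": list(pip_packages)}


def connect(port=10001):
    """Connect through Ray Client, using the same address form as the original report."""
    import ray  # imported lazily so build_runtime_env can be used without Ray installed

    # Assumption: the cluster's workers are allowed to pip-install at runtime.
    env = build_runtime_env("../src", ["scikit-learn"])
    return ray.init(f"ray://127.0.0.1:{port}", runtime_env=env)
```

If runtime installs are not possible on the cluster, installing the same packages in the image used by the head and worker pods achieves the same result without relying on runtime_env.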