[Ray general] Tons of irrelevant messages are sent to nodes very frequently after starting Ray on a cluster.
What is your question?
When I run
ray start --head --redis-port=6379
to start Ray on a head node (IP: 192.168.10.188, abbreviated to 10.188 below) and then run
ray start --address='192.168.10.188:6379' --redis-password='5241590000000000'
on two other nodes (IP: 10.94 and 10.181), I see traffic like that in the photo below, where I captured some packets:
My first question: why does the head node (IP: 10.188) send packets to 10.94 that include info for 10.181? The same happens with packets sent to 10.188 that include info for 10.94.
My second question: which component on the head node keeps sending this kind of message to all nodes at such short intervals, and why, even when no task is running? What do these messages mean?
I guess this may be a polling mechanism, but why does it need to run so frequently?
In my test, I connected 500 private nodes to one head node, and this traffic consumed a lot of network bandwidth. As I said, tons of irrelevant messages were sent to the nodes very frequently.
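To put a rough number on the bandwidth concern: if the head node really does relay each node's status to every other node (an assumption based only on the packet capture described above, not on Ray's source code), the traffic would grow roughly with the square of the node count. Below is a minimal sketch of that estimate. The function name, the 100-byte message size, and the 0.1 s interval are hypothetical placeholders, not measured values and not taken from Ray.

```python
def relay_bytes_per_second(num_nodes: int, msg_bytes: int, interval_s: float) -> float:
    """Rough outbound rate if every node's status is relayed to every other node.

    Assumed model (not confirmed against Ray's implementation): once per
    interval, each of the num_nodes nodes receives one message about each
    of the other num_nodes - 1 nodes.
    """
    messages_per_interval = num_nodes * (num_nodes - 1)
    return messages_per_interval * msg_bytes / interval_s


if __name__ == "__main__":
    # Hypothetical values: 100-byte messages, one round every 0.1 s.
    for n in (3, 500):
        rate = relay_bytes_per_second(n, msg_bytes=100, interval_s=0.1)
        print(f"{n} nodes: about {rate / 1e6:.3f} MB/s")
```

Under this assumed model, going from 3 nodes to 500 raises the number of messages per round from 6 to 249,500, which would be consistent with the heavy bandwidth use seen in the 500-node test.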
Any help is really appreciated.
Ray version and other system information (Python version, TensorFlow version, OS): Ray 0.8.4, Python 3.6.9
Issue Analytics
- State:
- Created 3 years ago
- Comments:10 (8 by maintainers)
Top GitHub Comments
COOL!!! It works! Thank u so much!
Yes, you can do that! It’s a bit ugly, but you just have to pass in a flag like this to the ray start command (make sure to do it on both the head and worker nodes):