[autoscaler] Remote execution gets slower, much space used on head node
Ray 1.0.1, Python 3.7.6 on conda, Ubuntu 18.04
After a few hours of running remote functions, they take much longer than they did at the beginning.
If I run ray stop on the head node and ray up xx.yaml on the driver node, the issue disappears, but it comes back after a few hours.
Also, /tmp/ray takes up a lot of storage when the head node has been running for a long time. Is there a command to clean up unnecessary files?
For now I just run rm -rf /tmp/ray.
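A less drastic alternative to wiping the whole directory might be to remove only old session directories. The sketch below assumes the usual layout in which each run writes to /tmp/ray/session_<timestamp> and a session_latest symlink points at the live one. The helper name prune_old_sessions and its parameters are hypothetical, not part of Ray.

```python
import shutil
import time
from pathlib import Path


def prune_old_sessions(root="/tmp/ray", max_age_hours=24.0, dry_run=True):
    """Return stale session_* directories under root; delete them unless dry_run.

    Assumption: root contains session_* directories plus a session_latest
    symlink to the active session. This is a hypothetical helper, not a Ray API.
    """
    root = Path(root)
    latest = root / "session_latest"
    # Resolve the symlink so the live session is never touched.
    active = latest.resolve() if latest.exists() else None
    cutoff = time.time() - max_age_hours * 3600
    stale = []
    for entry in sorted(root.glob("session_*")):
        # Skip the session_latest symlink itself and any stray files.
        if entry.is_symlink() or not entry.is_dir():
            continue
        if active is not None and entry.resolve() == active:
            continue
        if entry.stat().st_mtime < cutoff:
            stale.append(entry)
            if not dry_run:
                shutil.rmtree(entry, ignore_errors=True)
    return stale
```

Calling prune_old_sessions() with the default dry_run=True only lists candidates, so the result can be checked before passing dry_run=False.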
Issue Analytics
- State:
- Created 3 years ago
- Comments:5 (3 by maintainers)

Do you think it is possible to create a reproducible script?
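As a starting point for such a script, one option is to time the same batch of work repeatedly and watch whether the durations drift upward. This is only a sketch: time_batches is a hypothetical helper, and the Ray usage in the trailing comments is an assumption about how the original workload might be approximated, not the reporter's actual code.

```python
import time


def time_batches(run_batch, rounds=5, pause_s=0.0):
    """Call run_batch() `rounds` times and return each call's wall-clock seconds."""
    durations = []
    for _ in range(rounds):
        start = time.perf_counter()
        run_batch()
        durations.append(time.perf_counter() - start)
        # Optional pause between rounds, e.g. to spread samples over hours.
        time.sleep(pause_s)
    return durations


# Possible use against a running cluster (assumed workload, not from the issue):
#
#   import ray
#   ray.init(address="auto")
#
#   @ray.remote
#   def noop():
#       return None
#
#   durations = time_batches(
#       lambda: ray.get([noop.remote() for _ in range(1000)]),
#       rounds=100,
#       pause_s=60,
#   )
```

If later entries in the returned list are consistently larger than the early ones, that would reproduce the slowdown in a form that can be shared.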
As for the log size, you will be able to configure log rotation very soon (within a couple of weeks).