Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

[autoscaler] [kubernetes] Calling ray down does not remove Kubernetes services

See original GitHub issue

When creating a cluster on Kubernetes, Ray will create a Kubernetes Service that routes traffic to the head node when the user adds the following to their cluster config:

services:
      - apiVersion: v1
        kind: Service
        metadata:
            name: local-cluster-ray-head
        spec:
            selector:
                component: local-cluster-ray-head
            ports:
                - name: client
                  protocol: TCP
                  port: 10001
                  targetPort: 10001
                - name: dashboard
                  protocol: TCP
                  port: 8265
                  targetPort: 8265

However, when calling ray down cluster.yaml, this service (unlike the pods) will not be removed. The expected behavior is that all resources created by ray up should be properly cleaned up after calling ray down.
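Until teardown handles services, the leftover Service can be deleted by hand. A minimal sketch, assuming the service name from the config above and that the cluster runs in a `ray` namespace (an assumption; adjust to your deployment):

```shell
# Remove the Service that `ray down` leaves behind.
# The name matches metadata.name in the cluster config above;
# the `ray` namespace is an assumption -- adjust to your deployment.
kubectl -n ray delete service local-cluster-ray-head

# Verify nothing is still routing to the (now deleted) head node.
kubectl -n ray get services
```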

cc @richardliaw

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

1 reaction
edoakes commented, Mar 16, 2021

@tgaddair yep, makes sense, a lot of users follow a similar workflow. Our intention is for this workflow to be done via the Ray client like:

import ray.util

ray.util.connect("k8s_service_address", working_dir="/local/path", py_modules=["/path/to/local/module"])

The files in working_dir would be available in the working_dir of the tasks/actors on every node, and py_modules would be injected into sys.path.
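The py_modules behavior described here (shipped modules injected into sys.path on each node) can be illustrated in plain Python, independent of Ray itself; the staging directory and the mylib module below are hypothetical stand-ins, not Ray's actual mechanism:

```python
import pathlib
import sys
import tempfile

# Stand-in for the per-node staging directory where an uploaded
# py_module would land (hypothetical; not Ray's actual path).
staging = pathlib.Path(tempfile.mkdtemp())
(staging / "mylib.py").write_text(
    "def greet():\n    return 'hello from shipped module'\n"
)

# The "injection" step: prepend the staging dir to sys.path so the
# shipped module becomes importable, as the comment above describes.
sys.path.insert(0, str(staging))
import mylib

print(mylib.greet())  # prints: hello from shipped module
```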

1 reaction
tgaddair commented, Mar 16, 2021

Nice! All of that sounds good. For my use case, I want to be able to make changes to my local Python code that are then reflected on both the head node and the workers. Right now I do this through the YAML when using ray up:

file_mounts:
    /home/ray/src: python

Here python is a directory containing my local Python files.

In general, I think being able to launch a Ray cluster using the operator, then interact with it via the ray CLI, including syncing local files, would be an ideal user experience for my workflow.
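For re-syncing local changes without a full ray up, the autoscaler CLI already offers file-sync commands; a sketch, assuming the cluster config file and the file_mounts paths from the comment above:

```shell
# Push the local `python` directory to /home/ray/src on the head node,
# matching the file_mounts entry above (paths mirror that example).
ray rsync-up cluster.yaml python /home/ray/src

# Pull files back from the cluster in the reverse direction.
ray rsync-down cluster.yaml /home/ray/src python
```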

Read more comments on GitHub >

Top Results From Across the Web

Autoscaler does not seem to watch head node - Ray.io
Currently, Ray down does not delete the service and Ray up creates the service if it's not already present. There's an issue open...
Use the cluster autoscaler in Azure Kubernetes Service (AKS)
The cluster autoscaler decreases the number of nodes when there has been unused capacity for a period of time. Pods on a node...
Deleting an Amazon EKS cluster - AWS Documentation
Learn how to delete an Amazon EKS cluster. ... You can delete a cluster with eksctl , the AWS Management Console, ... kubectl...
Scaling Applications on Kubernetes with Ray | by Vishnu Deva
This pattern will let you eliminate the chances of a K8s cluster-wide Ray failure and more evenly distribute the load of autoscaling dozens...
Deploying Ray on a local kubernetes cluster | Telesens
I will not describe Kubernetes concepts in any great detail below. ... the scale-down time for the kubernetes Horizontal Pod Autoscaler ...
