KubeCluster loses the ability to run commands against a deployed cluster after periods of inactivity.


Cluster is created as:

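from dask_kubernetes import KubeCluster

# worker_pod and sched_pod are pod specs defined elsewhere (not shown in the report)
cluster = KubeCluster(pod_template=worker_pod, scheduler_pod=sched_pod, deploy_mode='remote')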

What happened: Initially, all API commands run against the cluster, such as ‘scale’ and ‘close’, function as expected. After some period of time, the cluster object fails to authenticate with the cluster, and every API command fails with an ‘Unauthorized’ (401) status.

site-packages/dask_kubernetes/core.py", line 81, in close
    await self.core_api.delete_namespaced_pod(name, namespace)
...
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}
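
The response body above is a Kubernetes Status object serialized as JSON. As a minimal sketch (the helper name `parse_status_body` is hypothetical and not part of dask_kubernetes or the Kubernetes client), the status code and reason can be pulled out of such a body to tell an authentication failure apart from other errors:

```python
import json


def parse_status_body(body):
    """Return (code, reason) from a Kubernetes Status JSON body.

    Returns (None, None) when the body is not a JSON object.
    Hypothetical helper for illustration only.
    """
    try:
        status = json.loads(body)
    except (TypeError, ValueError):
        return None, None
    if not isinstance(status, dict):
        return None, None
    return status.get("code"), status.get("reason")


# The body reported in the traceback above.
body = (
    '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    '"message":"Unauthorized","reason":"Unauthorized","code":401}'
)
code, reason = parse_status_body(body)
print(code, reason)  # prints: 401 Unauthorized
```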

What you expected to happen: I would expect the cluster object to re-authenticate, or provide some mechanism to re-attach to the created cluster.
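
One way to picture the re-authentication behavior described here is a wrapper that retries a call once after refreshing credentials. This is only a sketch under assumptions: `call_with_reauth` and `refresh_credentials` are hypothetical names, not dask_kubernetes APIs, and the thread does not confirm that expired credentials are the cause. The only property relied on is that the raised exception exposes an HTTP `status` attribute, as the 401 response above suggests.

```python
import asyncio


def is_unauthorized(exc):
    """True when the exception carries an HTTP 401 status attribute."""
    return getattr(exc, "status", None) == 401


async def call_with_reauth(api_call, refresh_credentials, retries=1):
    """Await api_call(); on a 401, await refresh_credentials() and retry.

    api_call: zero-argument coroutine function wrapping the real API call.
    refresh_credentials: zero-argument coroutine function (hypothetical)
        that reloads whatever credentials the client uses.
    Any non-401 error, or a 401 after the last retry, is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            return await api_call()
        except Exception as exc:
            if not is_unauthorized(exc) or attempt == retries:
                raise
            await refresh_credentials()
```

In this sketch, a call like the failing pod deletion would be passed in as `api_call` (for example via a small `async def` or `functools.partial`), while `refresh_credentials` would contain whatever step restores a valid token in the given environment, such as reloading the kubeconfig.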


Environment: GKE cluster. Until authentication is lost, everything else functions as expected.

  • Dask version: 2.30.0
  • Python version: 3.8.6
  • Operating System: GKE + CoS
  • Install method (conda, pip, source):

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
drobison00 commented, Feb 2, 2021

@jacobtomlinson Any updates on this? If not, do you have any intuition on what’s preventing the re-auth from occurring? If I get some extra time I could take a look.

0 reactions
jacobtomlinson commented, Feb 23, 2021

Sorry, I’ve been out for a while, and this is currently on the backlog.

If you have some time to investigate, it would be much appreciated.
