Doc unclear on actual usage scenarios that will work

I’ve been trying to use Dask-kubernetes to run a Jupyter notebook locally on my laptop while the Dask workers run on a Google Cloud Kubernetes cluster. I’ve been struggling a bit with the docs to figure out whether this is supported and how to do it, although it may not help that I’m learning Kubernetes at the same time. Then I found this comment, https://github.com/dask/dask-kubernetes/issues/58#issuecomment-377811223 , which made everything a lot clearer: “It [Dask-kubernetes] is designed to be run from a pod on a Kubernetes cluster that has permissions to launch other pods.”

So this issue is about clarifying the docs, to help people like me in the future. I’m willing to do the pull request, but we should probably figure out whether my understanding is accurate first. In addition to something like the above sentence (in the docs and the KubeCluster docstring), here are the things that would have been useful for me to have the docs clarify:

  • kubectl doesn’t need to be installed (because the Kubernetes Python API is used directly)… pretty sure this is true also…?
  • If .kube/config is configured correctly to be able to launch pods, it will be used, and workers will be launched on the default context. (Is that correct? If yes, it would allow the use case I’m looking for. If not, is there a way to influence the Kubernetes cluster used by Dask-kubernetes?)

If the second bullet is supported, it would also be nice to be able to specify a Kubernetes context other than the default, but that should probably be a separate issue…
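
For readers wondering what selecting a non-default context could look like, here is a minimal sketch using the kubernetes Python client directly. It assumes the `kubernetes` package is installed and a kubeconfig exists; `choose_context` and `load_config_for` are hypothetical helper names, and nothing in this thread confirms that Dask-kubernetes itself exposes such an option.

```python
def choose_context(contexts, active, wanted=None):
    """Pick a context name from the entries of a kubeconfig.

    `contexts` is a list of dicts with a "name" key and `active` is the
    dict for the current context, the shape returned by
    kubernetes.config.list_kube_config_contexts().
    """
    names = [ctx["name"] for ctx in contexts]
    if wanted is None:
        # Fall back to whatever the kubeconfig marks as current.
        return active["name"]
    if wanted not in names:
        raise ValueError(f"context {wanted!r} not found; available: {names}")
    return wanted


def load_config_for(wanted=None):
    """Load kubeconfig credentials for the chosen context (hypothetical helper)."""
    # Imported lazily so the pure helper above works without the package.
    from kubernetes import config

    contexts, active = config.list_kube_config_contexts()
    name = choose_context(contexts, active, wanted)
    config.load_kube_config(context=name)
    return name
```

Calling `load_config_for("my-gke-context")` (a made-up context name) before creating any Kubernetes API objects would point the client at that cluster instead of the default one.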

Thanks!

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

1 reaction
mrocklin commented, Jul 5, 2018

Medium-term we might want to move the scheduler to a separate Pod to better handle situations like these.

On Thu, Jul 5, 2018 at 12:10 PM, Jacob Tomlinson notifications@github.com wrote:

Hi @chrish42 https://github.com/chrish42 thanks for raising this.

The main place I use this is from Pangeo http://pangeo-data.org/ which runs your Jupyter Lab/Notebook in a pod on a kubernetes cluster. It also ensures that pod has permissions to launch other pods using service accounts. It also does not have kubectl installed so you are correct that it is not a requirement.

You are also right that dask-kubernetes should be able to create worker pods from any machine which has the kubernetes config configured correctly. This project wraps the kubernetes python module which can load credentials from a number of places.
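
As a rough illustration of "a number of places", the sketch below checks the conventional credential locations: the service account token mounted into a pod, files named in the `KUBECONFIG` environment variable, and `~/.kube/config`. This is not the library's own lookup code, and the order in which Dask-kubernetes tries these sources is an assumption; `find_credential_sources` is a hypothetical helper name.

```python
import os
from pathlib import Path

# Conventional locations; assumptions about a typical setup.
IN_CLUSTER_TOKEN = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")
DEFAULT_KUBECONFIG = Path.home() / ".kube" / "config"


def find_credential_sources(env=None, token_path=IN_CLUSTER_TOKEN,
                            default_kubeconfig=DEFAULT_KUBECONFIG):
    """Return (kind, path) pairs for the credential sources that exist here."""
    env = os.environ if env is None else env
    found = []

    # Inside a pod, the API server address is injected via the environment
    # and the service account token is mounted as a file.
    if env.get("KUBERNETES_SERVICE_HOST") and Path(token_path).is_file():
        found.append(("in-cluster service account", str(token_path)))

    # KUBECONFIG may name several files separated by os.pathsep.
    for entry in env.get("KUBECONFIG", "").split(os.pathsep):
        if entry and Path(entry).is_file():
            found.append(("KUBECONFIG", entry))

    # The default location used by kubectl and the client libraries.
    if Path(default_kubeconfig).is_file():
        found.append(("default kubeconfig", str(default_kubeconfig)))

    return found
```

Running `find_credential_sources()` on a laptop and again inside a notebook pod should show the difference between the two setups discussed here: a kubeconfig file in the first case, a service account token in the second.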

I would hazard a guess that the problem you are experiencing is that your Jupyter Notebook is running on a different network to your workers. When the worker pods are created on kubernetes they try and connect back to the scheduler, which is running in your Notebook. As you are on different networks this will not be possible without a VPN setup. We do not experience this issue on Pangeo as the notebooks are in the cluster along with the workers.
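
One way to test this guess is a plain TCP check run from inside a worker pod against the machine hosting the scheduler. The sketch below uses only the standard library; `can_reach` is a hypothetical helper, and the host you pass in is whatever address your laptop is supposed to be reachable at. Port 8786 is the Dask scheduler's default, but your setup may use another.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

If `can_reach("<scheduler-address>", 8786)` returns False from a worker pod while the scheduler is running, the workers cannot connect back, which matches the network separation described above.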

It would be great if you could raise a PR to help others understand the conditions and limitations.

0 reactions
chrish42 commented, Jul 9, 2018

Thanks for the merge. I’ve opened #84 about spawning the Dask scheduler on a separate pod, and #85 also for a more explicit way to specify the Kubernetes cluster. I’d be open to help with those, but I would probably need some guidance. But we can continue the discussion in those other issues. Thanks again!
