Update cpu and memory requests after cluster creation

On our Pangeo deployment, users get a default worker-template.yaml which allows them to create clusters by simply running cluster = KubeCluster(), without having to worry about what Kubernetes even is.

However, on some occasions people want to be able to update the memory and CPU ratios of their workers depending on what they are running. The current workflow for this is either to specify the whole template as a dict, or to copy the default worker-template.yaml, understand it, update the values and then use KubeCluster.from_yaml().

Personally I ended up writing a couple of helper functions in my notebook which look like this:

def update_worker_memory(cluster, new_limit):
    # The first container in the pod template is assumed to be the Dask worker.
    container = cluster.pod_template.spec.containers[0]
    container.resources.limits["memory"] = new_limit
    container.resources.requests["memory"] = new_limit
    # Keep the worker's own --memory-limit flag in step with the pod limit.
    if '--memory-limit' in container.args:
        index = container.args.index('--memory-limit')
        container.args[index + 1] = new_limit
    return cluster

def update_worker_cpu(cluster, new_limit):
    # The first container in the pod template is assumed to be the Dask worker.
    container = cluster.pod_template.spec.containers[0]
    container.resources.limits["cpu"] = new_limit
    container.resources.requests["cpu"] = new_limit
    # Keep the worker's thread count in step with the CPU limit.
    if '--nthreads' in container.args:
        index = container.args.index('--nthreads')
        container.args[index + 1] = new_limit
    return cluster

This allows me to adjust the worker template after the cluster has been created; all workers launched afterwards follow the updated values.
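As a rough illustration of how the two helpers could be folded into one, here is a minimal sketch that can be run without a cluster. The names `update_worker_resources` and `set_flag_value` are hypothetical, and the `SimpleNamespace` objects are only stand-ins that mimic the attribute layout used above (`spec.containers[0].resources.limits`, `.requests` and `.args`); a real `cluster.pod_template` would be passed in their place.

```python
from types import SimpleNamespace


def set_flag_value(args, flag, value):
    """Replace the value that follows `flag` in an argument list, if present."""
    if flag in args:
        index = args.index(flag)
        if index + 1 < len(args):
            # Command-line arguments are kept as strings.
            args[index + 1] = str(value)


def update_worker_resources(pod_template, memory=None, cpu=None):
    """Update limits, requests and worker flags on the first container.

    Hypothetical helper: assumes the first container is the Dask worker.
    """
    container = pod_template.spec.containers[0]
    if memory is not None:
        container.resources.limits["memory"] = memory
        container.resources.requests["memory"] = memory
        set_flag_value(container.args, "--memory-limit", memory)
    if cpu is not None:
        container.resources.limits["cpu"] = cpu
        container.resources.requests["cpu"] = cpu
        set_flag_value(container.args, "--nthreads", cpu)
    return pod_template


# Stand-in for a pod template; the values are made up for this example.
template = SimpleNamespace(
    spec=SimpleNamespace(
        containers=[
            SimpleNamespace(
                args=["dask-worker", "--nthreads", "2", "--memory-limit", "6GB"],
                resources=SimpleNamespace(
                    limits={"cpu": "2", "memory": "6G"},
                    requests={"cpu": "2", "memory": "6G"},
                ),
            )
        ]
    )
)

update_worker_resources(template, memory="12G", cpu="4")
print(template.spec.containers[0].args)
```

One thing to keep in mind with this approach: the same value is written to the Kubernetes CPU limit and to `--nthreads`, so a fractional CPU quantity such as `500m` would not be a sensible thread count and would need separate handling.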

I’m considering how to add this functionality into the core project. I’m inspired by the dask-jobqueue SLURMCluster which allows you to specify cores and memory as kwargs. Therefore perhaps @mrocklin, @jhamman or @guillaumeeb have thoughts.

Before I go charging in to raise a PR I would like to discuss options.

  • Would it be useful to add methods to the KubeCluster object to update sizes after creation, as I do above?
  • Should we add kwargs to the cluster init and if so should they create the cluster and use the helpers or update the config before creation?
  • Are there any other ways of specifying memory and cpu that I haven’t captured in the examples above?
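To make the second option more concrete ("update the config before creation"), here is a minimal sketch that works on a dict-style template rather than on a live cluster. The function name `with_worker_resources`, the `memory` and `cores` keyword names, and the contents of `default_template` are all assumptions made for illustration; they are not an existing dask-kubernetes API. How the resulting dict is handed to the cluster constructor is deliberately left out.

```python
import copy


def with_worker_resources(template, memory=None, cores=None):
    """Return a copy of a dict-style pod template with new worker sizes.

    Hypothetical helper: assumes the first container is the Dask worker.
    """
    spec = copy.deepcopy(template)  # leave the shared default untouched
    container = spec["spec"]["containers"][0]
    resources = container.setdefault("resources", {})
    for section in ("limits", "requests"):
        values = resources.setdefault(section, {})
        if memory is not None:
            values["memory"] = memory
        if cores is not None:
            values["cpu"] = str(cores)

    # Mirror the new sizes in the worker's command-line flags, if present.
    args = container.get("args", [])
    for flag, value in (("--memory-limit", memory), ("--nthreads", cores)):
        if value is not None and flag in args:
            index = args.index(flag)
            if index + 1 < len(args):
                args[index + 1] = str(value)
    return spec


# Made-up default template in the shape of a pod manifest.
default_template = {
    "kind": "Pod",
    "spec": {
        "containers": [
            {
                "name": "dask-worker",
                "args": ["dask-worker", "--nthreads", "2", "--memory-limit", "6GB"],
                "resources": {
                    "limits": {"cpu": "2", "memory": "6G"},
                    "requests": {"cpu": "2", "memory": "6G"},
                },
            }
        ]
    },
}

big = with_worker_resources(default_template, memory="16G", cores=4)
print(big["spec"]["containers"][0]["resources"])
```

Working on a copy means the deployment-wide default stays intact, which is the main difference from the helper functions earlier in the issue, which mutate the template of an existing cluster.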

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 15 (12 by maintainers)

Top GitHub Comments

2 reactions
mrocklin commented, Nov 28, 2018

I think that this seems like a reasonable request. I think that it would require adding logic to the Adaptive class so that it looks at resources when making requests. It’s non-trivial work, but seems reasonably in-scope.

2 reactions
guillaumeeb commented, Nov 28, 2018

We are thinking about this in https://github.com/dask/distributed/issues/2118 and https://github.com/dask/distributed/issues/2208#issuecomment-419140500.

But we haven’t talked about the adaptive part yet (“scale them according to what the client requests”), which would probably need modifications too.
