Proposal: `NamedCluster`
Hi! First, huge thanks for the work in this repo - it has made working on clusters so easy.
Currently, I must know in advance which type of cluster a user will be running on in order to decide which Cluster object (slurm/sge etc.) to instantiate; the appropriate values are then filled in from the config file.
Would you consider a PR adding a `NamedCluster` class? The idea is quite simple:
- add a `jobqueue-named-clusters` section to the existing cluster config specification
- make `scheduler` a key at the same level as the rest of the entries in the existing cluster config
This would look something like:
jobqueue-named-clusters:
  my-awesome-cluster:
    scheduler: 'slurm'
    name: dask-worker
    cores: 36                  # Total number of cores per job
    memory: '109 GB'           # Total amount of memory per job
    processes: 9               # Number of Python processes per job
    interface: ib0             # Network interface to use like eth0 or ib0
    queue: regular
    walltime: '00:30:00'
    resource-spec: select=1:ncpus=36:mem=109GB
`NamedCluster` would then dispatch to the appropriate `Cluster` class based on the value found in `scheduler`.
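The dispatch described above could be sketched roughly as follows. This is only an illustration, not existing dask-jobqueue API: `SCHEDULER_CLASSES`, `resolve_named_cluster`, and `named_cluster` are hypothetical names, the config is passed in as a plain dict rather than read from dask's config system, and converting dashed config keys to underscored keyword arguments is an assumption. The sketch also does not check that the selected class accepts every key in the entry.

```python
import importlib

# Scheduler name -> cluster class name in dask_jobqueue.
# The classes exist in dask-jobqueue; this mapping is a hypothetical helper.
SCHEDULER_CLASSES = {
    "slurm": "SLURMCluster",
    "sge": "SGECluster",
    "pbs": "PBSCluster",
    "lsf": "LSFCluster",
}


def resolve_named_cluster(name, config):
    """Return (class_name, kwargs) for the named entry in ``config``."""
    try:
        # Copy the entry so popping 'scheduler' does not mutate the config.
        entry = dict(config["jobqueue-named-clusters"][name])
    except KeyError:
        raise KeyError(f"no named cluster {name!r} in config") from None

    scheduler = entry.pop("scheduler")
    try:
        class_name = SCHEDULER_CLASSES[scheduler.lower()]
    except KeyError:
        raise ValueError(f"unsupported scheduler {scheduler!r}") from None

    # Assumption: dashed config keys map to underscored keyword arguments.
    kwargs = {key.replace("-", "_"): value for key, value in entry.items()}
    return class_name, kwargs


def named_cluster(name, config):
    """Instantiate the cluster class selected by the entry's 'scheduler' key."""
    class_name, kwargs = resolve_named_cluster(name, config)
    # Imported lazily so the lookup can be exercised without dask-jobqueue.
    module = importlib.import_module("dask_jobqueue")
    return getattr(module, class_name)(**kwargs)
```

Splitting the lookup (`resolve_named_cluster`) from the instantiation (`named_cluster`) keeps the config handling testable on a machine with no scheduler available.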
The use case I envision is providing daskified programs for non-Python users to run in HPC environments - I would be able to use `NamedCluster` liberally in my programs across projects and leave all cluster config to the end user in a config file:
from dask_jobqueue import NamedCluster
cluster = NamedCluster("my-awesome-cluster")
Top GitHub Comments
From my side I’m very happy with simple loading from a `yaml` file, thanks for taking the time to play with my use case and provide a nice solution!

This proposal sounds really interesting, it’s great to hear folks asking for this.
I just want to weigh in and say this is exactly the kind of thing we are trying to solve with `dask-ctl`, and we already have a similar feature over there. For any cluster manager that supports `dask-ctl` you can create a YAML spec file and create the cluster from the command line. @ian-r-rose and I have also been discussing replacing the cluster spawning tooling in the Jupyter Lab Extension with this.

I would caution against implementing this in `dask-jobqueue` and instead suggest that we add `dask-ctl` support to `dask-jobqueue` and reuse the functionality from there.

What do you think @alisterburt, @guillaumeeb?