Allow starting a Scheduler in a batch job
One of the goals of the ClusterManager object is to be able to launch a remote scheduler. In the dask-jobqueue scope, this probably means submitting a job that starts a Scheduler, and then connecting to it.
We probably still lack some remote interface between the ClusterManager and the Scheduler object for this to work, so it will probably mean extending APIs upstream.
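To make the idea concrete, here is a minimal sketch of what the submitted batch job could run. It assumes a shared filesystem path (/shared/scheduler.json, purely illustrative) and uses only existing distributed APIs (Scheduler(scheduler_file=...)); exact behaviour depends on the distributed version.

```python
# Sketch only: payload executed inside the submitted batch job.
# It starts a Scheduler and writes its contact address to a file on a
# shared filesystem, so the process that submitted the job can find it.
import asyncio
from distributed import Scheduler

SCHEDULER_FILE = "/shared/scheduler.json"  # illustrative path on a shared filesystem

async def run_scheduler():
    async with Scheduler(scheduler_file=SCHEDULER_FILE) as scheduler:
        await scheduler.finished()  # block until the scheduler is told to shut down

if __name__ == "__main__":
    asyncio.run(run_scheduler())
```

On the submitting side, connecting is then a matter of pointing a Client at the same file, e.g. Client(scheduler_file="/shared/scheduler.json"); the open question in this issue is which extra scheduler operations the ClusterManager needs beyond that connection.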
Identified Scheduler methods to provide (a sketch of what is already callable remotely follows the list):
- retire_workers(n, minimum, key)
- scheduler_info(), already existing, see if sufficient
- add_diagnostic_plugin(), and mostly retrieving plugin information remotely
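For reference, part of this surface is already reachable from an already-connected Client; the sketch below shows what exists today (assuming the illustrative scheduler file from above), with argument names and behaviour varying across distributed versions.

```python
# Sketch: what can already be driven remotely through a Client today.
from distributed import Client

client = Client(scheduler_file="/shared/scheduler.json")  # illustrative path

# scheduler_info() already exists and returns workers, services, addresses, ...
info = client.scheduler_info()
print(sorted(info["workers"]))

# retire_workers() exists too, but only with an explicit worker list;
# the n/minimum/key selection mentioned above would need upstream support.
some_workers = list(info["workers"])[:2]
client.retire_workers(workers=some_workers)

client.close()
```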
I suspect that adaptive will need to change significantly too; this may lead to a transitional adaptive logic in dask-jobqueue, and other remote functions to add to the scheduler.
This is in the scope of #170.
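As a reminder of what adaptive looks like from the cluster side today, the sketch below shows the current entry point; with a remote scheduler, the scaling recommendations behind it would have to travel over the ClusterManager/Scheduler interface discussed above. The PBSCluster resource values are purely illustrative.

```python
# Sketch: adaptive scaling as it is driven from the cluster object today.
# With a remote scheduler, the recommendations behind adapt() would need to
# be exchanged over the ClusterManager <-> Scheduler remote interface.
from dask_jobqueue import PBSCluster

cluster = PBSCluster(cores=24, memory="100GB", queue="regular")  # illustrative resources
cluster.adapt(minimum=1, maximum=20)  # scale the number of workers with the load
```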
Top GitHub Comments
I am currently using dask-joblib on a PBS cluster and running the scheduler on the login node. It is indeed a bit problematic because the login node has only 2 GB of memory, and it quickly runs out if I am not careful with the size of the computation graphs.
So I think I would definitely benefit from this feature.
@muammar I see that you have commented in https://github.com/dask/dask-jobqueue/pull/390#issuecomment-603558844. Could you please explain the admin rules that are in place on your cluster, just to get an idea of what you are allowed to do there?
You may be interested in my answer above: https://github.com/dask/dask-jobqueue/issues/186#issuecomment-568265386. Let me try to sum up:
1. Running the scheduler (the Cluster object) in an interactive job: probably easier. If you like to work in a Jupyter environment, this is doable this way. There are a few hoops to jump through (mostly SSH port forwarding to open your Jupyter notebook in your local browser at localhost:<some-port>).
2. Running the scheduler (the Cluster object) in a batch job: only Python scripts, not a Jupyter environment.
In both 1. and 2. you need to bear in mind that as soon as your scheduler job finishes, you will lose all your workers after ~60s. That may mean losing the result of lengthy computations.