Why is the walltime set by default?
I was wondering why the default walltime for workers is set to 30 minutes instead of None.
For example, see https://github.com/dask/dask-jobqueue/blob/1f9ae1ecc79a5b76930e56c8801f3b8ace659877/dask_jobqueue/jobqueue.yaml#L71
I kept having failed jobs for 2 days until I realized that this library was actually setting a hard limit on my jobs. Why? This was very hard to debug among all the other errors that I was trying to fix.
If you want your users to be aware of walltime, isn’t there a better solution than setting it to a random number?
Can’t you make it a required variable?
Can’t the default value be something like "not-set-by-user", with a complaint if it’s not set?
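The "not-set-by-user" idea can be sketched in a few lines of plain Python. This is only an illustration of the pattern being asked for: the function name `resolve_walltime` and the module-level sentinel are hypothetical and are not part of dask-jobqueue.

```python
# Hypothetical sketch: none of these names come from dask-jobqueue.
_NOT_SET = "not-set-by-user"  # sentinel default, as suggested in the question


def resolve_walltime(walltime=_NOT_SET):
    """Return the walltime to request, refusing to guess when none was given."""
    if walltime == _NOT_SET:
        # Complain loudly instead of silently applying a 30-minute limit.
        raise ValueError(
            "walltime was not set; pass e.g. walltime='02:00:00', "
            "or walltime=None to request no explicit limit"
        )
    return walltime
```

With this shape, an explicit `None` still means "no limit requested", while forgetting the argument fails immediately rather than two days into debugging.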
Issue Analytics
- State:
- Created 2 years ago
- Comments: 15 (11 by maintainers)
Top GitHub Comments
I’m definitely in favour of documenting it.
I am also in favour of logging output to the user when we create cluster managers for them. In dask-cloudprovider we log some information about the cluster object when it gets instantiated, and that information is also available at cluster.get_logs(). We log things like VM type, docker image, region, etc. I think other cluster managers should start implementing this too.
OK, what you say makes complete sense!
So to fix this issue, we should both:
_log method of the Cluster object.
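To make the logging suggestion from the comments concrete, here is a minimal sketch of a cluster-like object that records its key settings when it is instantiated and exposes them again later. `ToyCluster` and `_ListHandler` are hypothetical stand-ins built on the standard `logging` module; the `get_logs()` method here only borrows the name mentioned above and does not reproduce the real dask-cloudprovider or dask-jobqueue behaviour.

```python
import logging


class _ListHandler(logging.Handler):
    """Collect formatted log messages in a list so they can be read back later."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


class ToyCluster:
    """Hypothetical stand-in for a cluster manager; not a real Dask class."""

    def __init__(self, walltime="00:30:00", cores=1):
        self.walltime = walltime
        self.cores = cores
        self._handler = _ListHandler()
        # One logger per instance so each cluster keeps its own messages.
        self._logger = logging.getLogger(f"toycluster.{id(self)}")
        self._logger.setLevel(logging.INFO)
        self._logger.addHandler(self._handler)
        # Surface the settings that would otherwise be silent defaults.
        self._logger.info("walltime per job: %s", self.walltime)
        self._logger.info("cores per job: %s", self.cores)

    def get_logs(self):
        """Return a copy of the messages logged so far."""
        return list(self._handler.messages)
```

A user who never passed a walltime would then see the 30-minute default as soon as the object is created, or on a later call to `ToyCluster().get_logs()`.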