Why is the walltime set by default?

See original GitHub issue

I was wondering why the default walltime for workers is set to 30 minutes instead of None?

For example, see https://github.com/dask/dask-jobqueue/blob/1f9ae1ecc79a5b76930e56c8801f3b8ace659877/dask_jobqueue/jobqueue.yaml#L71

I kept having failed jobs for 2 days until I realized that this library is actually setting a hard limit on my jobs. Why? This was very hard to debug among all the other errors that I was trying to fix.

If you want your users to be aware of walltime, isn’t there a better solution than setting it to a random number? Can’t you make it a required variable? Can’t the default value be something like "not-set-by-user" and complain if it’s not set?
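The "not-set-by-user" idea above can be sketched with a sentinel default. This is only an illustration of the suggestion, not how dask-jobqueue works: `_NOT_SET` and `resolve_walltime` are hypothetical names, and the choice to let an explicit `None` mean "no limit requested" is an assumption.

```python
# Hypothetical sketch of a "required" walltime; not dask-jobqueue code.
_NOT_SET = object()  # sentinel that is distinct from None


def resolve_walltime(walltime=_NOT_SET):
    """Return the walltime, or complain loudly if the user never chose one."""
    if walltime is _NOT_SET:
        raise ValueError(
            "walltime was not set; pass e.g. walltime='02:00:00', "
            "or walltime=None to request no explicit limit"
        )
    # An explicit None is passed through, so "no limit" stays a valid choice.
    return walltime
```

With this shape, forgetting the setting fails at cluster creation rather than 30 minutes into a job, while `None` remains available to users who want the scheduler's own default.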

Issue Analytics

Created 2 years ago
Comments:15 (11 by maintainers)

Top GitHub Comments

3 reactions

jacobtomlinson commented, Jun 28, 2021

I’m definitely in favour of documenting it.

I am also in favour of logging output to the user when we create cluster managers for them. In dask-cloudprovider we log some information about the cluster object when it gets instantiated and that information is also available at cluster.get_logs(). We log things like VM type, docker image, region, etc. I think other cluster managers should start implementing this too.

1 reaction

guillaumeeb commented, Jul 3, 2021

OK, what you say makes complete sense!

So to fix this issue, we should both:

update the documentation to inform users about the default setting (and explain it a bit),
log some output with the jobqueue cluster's main settings by implementing the _log method of the Cluster object.
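The second point could look roughly like the following. It is a standalone sketch, not the real Cluster class: `SketchCluster`, its parameters, and the message format are assumptions made for illustration, and only the `_log` and `get_logs` names and the 30-minute default come from the discussion above.

```python
import logging

logger = logging.getLogger("jobqueue_sketch")


class SketchCluster:
    """Hypothetical stand-in for a cluster manager that reports its settings."""

    def __init__(self, queue=None, walltime="00:30:00", cores=1, memory="4GB"):
        # The walltime default mirrors the 30 minutes mentioned in the issue.
        self.settings = {
            "queue": queue,
            "walltime": walltime,
            "cores": cores,
            "memory": memory,
        }
        self._messages = []
        summary = ", ".join(f"{k}={v}" for k, v in self.settings.items())
        self._log(f"Creating cluster with settings: {summary}")

    def _log(self, message):
        # Keep the message for later retrieval and also send it to logging.
        self._messages.append(message)
        logger.info(message)

    def get_logs(self):
        # Return a copy so callers cannot alter the stored history.
        return list(self._messages)
```

A user who never passed a walltime would then still see `walltime=00:30:00` in the output of `get_logs()`, which is the kind of hint that was missing in the original report.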
