Support SGE scheduler
It’d be great to support SGE as a scheduler. I’m trying to use the PBS backend, since the two are somewhat similar, but polling doesn’t really work: at
https://github.com/eth-cscs/reframe/blob/45fbacb23210c757724882a13c4a53f33af04800/reframe/core/schedulers/pbs.py#L186
the command `qstat -f Your` is executed for me, so I guess something is going wrong.
I might be able to work on this implementation in the near future. It shouldn’t be too different from PBS, but I’ll likely need some assistance along the way 🙂
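One plausible cause of the `qstat -f Your` command (an assumption, not confirmed in the thread) is that the job ID is taken from the first token of the `qsub` output. PBS prints the bare job ID, whereas SGE typically prints a sentence such as `Your job 12345 ("test.sh") has been submitted`, so the first token is `Your`. A minimal sketch of extracting the ID from SGE-style output follows; the helper name `parse_sge_jobid` is hypothetical and not part of ReFrame.

```python
import re

# Matches the usual SGE submission message. Array jobs are reported as
# 'Your job-array 42.1-10:1 ("name") has been submitted'.
_SGE_SUBMIT_RE = re.compile(r'Your job(?:-array)? (\d+)')


def parse_sge_jobid(qsub_output):
    """Return the numeric job ID found in SGE-style qsub output.

    Hypothetical helper for illustration only.
    """
    match = _SGE_SUBMIT_RE.search(qsub_output)
    if match is None:
        raise ValueError(f'could not extract job ID from: {qsub_output!r}')

    return match.group(1)


sge_output = 'Your job 12345 ("test.sh") has been submitted'

# Taking the first whitespace-delimited token reproduces the symptom
first_token = sge_output.split()[0]

# Parsing the message explicitly yields the actual job ID
jobid = parse_sge_jobid(sge_output)
```

If the message format differs on a given SGE installation, the regular expression would need adjusting; the sketch raises `ValueError` rather than silently returning a wrong ID.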
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
Actually, there is no need to look outside the scheduler backends. To implement a scheduler backend you only need to implement the interface of the `JobScheduler` abstract class. As soon as you do this and decorate your scheduler class with `@register_scheduler(name)`, it should be integrated with the framework and ready to be used.

Job schedulers manage `Job` instances, which are the job descriptors and contain the information about the job that has been submitted. The rest of the framework does not know about the backends at all. The framework will call `Job.create()` to create a new job with all the information retrieved from the test spec. This in turn will call the `make_job()` method of the scheduler backend. `Job` is not abstract, but backends may choose to extend it just to add fields relevant to them; see, for example, `_PbsJob`.

The rest of the `JobScheduler` API takes either a single job or a list of jobs to process. If you look into the `reframe.core.schedulers` module, you will see the documentation for each API function. Practically, for a scheduler backend, the most important methods are `emit_preamble(job)`, `submit(job)`, `cancel(job)` and `poll(*jobs)`. The `poll` method may take multiple jobs at once, because it is more efficient to issue a single poll command and retrieve the state of multiple jobs than to poll each one individually. How this is implemented is entirely up to the backend.

And a small correction to the documentation of the API. The following is not correct:
https://github.com/eth-cscs/reframe/blob/45fbacb23210c757724882a13c4a53f33af04800/reframe/core/schedulers/__init__.py#L86-L95
The `finished()` method does not poll (this is a stale comment 😬). The `poll()` method polls, and `finished()` simply retrieves the job state (whether a job has finished or not) and raises any job-related error that has happened during polling.

The `filternodes()` and `allnodes()` methods are not essential (see PBS backend) unless you want your backend to support flexible jobs (see here).

That was helpful, thanks! When submitting a job with SGE I get
I’m now going through the other changes of the syntax.