Available queueing and scheduling systems

This page contains links to some commonly used queueing and scheduling systems. UNICC has not yet decided which queueing system it will adopt. The following summary, together with the references, is intended to make that choice easier.

There are (at least) two commercially available queueing systems: NQE/NQS and LSF. NQE/NQS has been developed by Cray Research and is available to us at no extra cost. However, this does not necessarily make NQE the "best" choice, as it is targeted at workstation clusters rather than shared memory computers. LSF is a competitive load sharing facility developed by Platform Computing. Its current users include Boston University, which uses it for job scheduling on its SGI machines (a Power Challenge array and a new Origin system).

The supercomputing centre at ANL was not satisfied with any of the commercially available systems (including IBM's own LoadLeveler), so David A. Lifka at ANL developed a new scheduling system called "The Extensible Argonne Scheduling System", EASY. It is able to schedule large parallel jobs together with serial ones optimally within one unified queue. EASY is currently used, among others, by ANL's High-Performance Computing Research Facility (on a 128-node SP2) and by Cornell Theory Center (on a 512-node SP2). EASY has also been adopted successfully by PDC at KTH. The source code (EASY is written in Perl) is freely available.
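To illustrate how one queue can serve large parallel jobs and serial jobs at the same time, here is a minimal sketch of "backfilling", the technique usually associated with EASY. It is not taken from the EASY source: the names Job and schedule_step are hypothetical, and the sketch assumes that every job declares a node count and a runtime estimate. The job at the head of the queue gets a reserved start time, and later jobs may start early only if they do not delay that reservation.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int    # number of nodes requested
    runtime: int  # user-supplied runtime estimate


def schedule_step(now, total_nodes, running, queue):
    """Return the names of the queued jobs to start at time `now`.

    running: list of (end_time, nodes) for jobs already executing.
    queue:   list of Job in arrival order (one unified queue).
    """
    free = total_nodes - sum(n for _, n in running)
    running = list(running)
    queue = list(queue)
    started = []

    # 1. Start jobs from the head of the queue for as long as they fit.
    while queue and queue[0].nodes <= free:
        job = queue.pop(0)
        free -= job.nodes
        running.append((now + job.runtime, job.nodes))
        started.append(job.name)
    if not queue:
        return started

    # 2. The head job is blocked: find the earliest time at which enough
    #    nodes will be free for it, and reserve that start time.
    head = queue[0]
    avail = free
    reserved_start = None
    for end, n in sorted(running):
        avail += n
        if avail >= head.nodes:
            reserved_start = end
            break
    if reserved_start is None:
        return started  # head job asks for more nodes than the machine has
    spare = avail - head.nodes  # nodes the head job will not need

    # 3. Backfill: a later job may start now if it fits in the free nodes
    #    and either finishes before the reservation or uses spare nodes only.
    for job in queue[1:]:
        if job.nodes > free:
            continue
        if now + job.runtime <= reserved_start:
            pass
        elif job.nodes <= spare:
            spare -= job.nodes
        else:
            continue
        free -= job.nodes
        started.append(job.name)
    return started
```

In this sketch a short serial job can slip in ahead of a waiting 8-node job as long as it finishes before the nodes the large job needs become free, while a long serial job has to wait. The runtime estimates are what make this safe, so a real system would also have to deal with jobs that overrun them.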

SGI is working on a scheduling system called "miser". It looks very similar to the EASY queueing system, but miser will probably add functionality for job control within a shared memory computer (down to the kernel level, if necessary). It is likely that the existing product "Share" will be merged with "miser" later this year.

Finally, I have included a reference to our own 'que' system, which was developed by Urban Engberg.


Lennart Bengtsson 1996-12-27