A batch scheduler with high level components
Nicolas Capit (ID - Imag, Inria Rhône-Alpes / Id-Imag), Georges Da Costa (ID - Imag, Inria Rhône-Alpes / Id-Imag), Yiannis Georgiou (ID - Imag, Inria Rhône-Alpes / Id-Imag), Guillaume Huard (ID - Imag, Inria Rhône-Alpes / Id-Imag), Cyrille Martin (ID - Imag)

TL;DR
This paper introduces OAR, a batch scheduler for large clusters that uses high-level tools like Perl and MySQL to achieve efficiency, scalability, and robustness without complex software design.
Contribution
The paper demonstrates that a complex resource management system can be built with high-level tools without sacrificing performance or scalability.
Findings
OAR effectively manages 700 nodes in a metropolitan grid.
Performance is comparable to other batch schedulers despite using high-level tools.
The system offers features like priority scheduling, reservations, and backfilling.
Abstract
In this article we present the design choices and the evaluation of a batch scheduler for large clusters, named OAR. This batch scheduler is based on an original design that emphasizes low software complexity through the use of high-level tools. The global architecture is built on the scripting language Perl and the relational database engine MySQL. The goal of the OAR project is to prove that it is possible today to build a complex resource management system using such tools without sacrificing efficiency and scalability. Currently, our system offers most of the important features implemented by other batch schedulers, such as priority scheduling (by queues), reservations, backfilling, and some global computing support. Despite the use of high-level tools, our experiments show that our system achieves performance close to that of other systems. Furthermore, OAR is currently used for the…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management
