Optimized Memoryless Fair-Share HPC Resources Scheduling using Transparent Checkpoint-Restart Preemption
Kfir Zvi, Gal Oren

TL;DR
This paper introduces a novel memoryless fair-share scheduling method for HPC resources that uses transparent checkpoint-restart preemption to improve system utilization and fairness without additional costs or penalties.
Contribution
It presents a new scheduling approach that enhances resource allocation efficiency and fairness in supercomputing by leveraging transparent checkpoint-restart preemption.
Findings
Increased system utilization and fairness.
No additional costs or penalties for users.
Enhanced resource allocation efficiency.
Abstract
Common resource management methods in supercomputing systems usually include hard divisions, capping, and quota allotment. Those methods, despite their 'advantages', have some known serious disadvantages including unoptimized utilization of an expensive facility, and occasionally there is still a need to dynamically reschedule and reallocate the resources. Consequently, those methods involve bad supply-and-demand management rather than a free market playground that will eventually increase system utilization and productivity. In this work, we propose the newly Optimized Memoryless Fair-Share HPC Resources Scheduling using Transparent Checkpoint-Restart Preemption, in which the social welfare increases using a free-of-cost interchangeable proprietary possession scheme. Accordingly, we permanently keep the status-quo in regard to the fairness of the resources distribution while maximizing…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsCloud Computing and Resource Management · Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques
