Improving the Effective Utilization of Supercomputer Resources by Adding Low-Priority Containerized Jobs
Julia Dubenskaya, Stanislav Polyakov

TL;DR
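This paper proposes a container management system that runs low-priority, non-parallel jobs in containers on idle supercomputer resources. It uses container migration tools to split each job's execution into separate intervals and interacts with the supercomputer scheduler. In simulations, it increases effective resource utilization under most conditions.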
Contribution
It introduces a container management system that efficiently utilizes idle supercomputer resources through low-priority containerized jobs and migration tools.
Findings
Higher effective resource utilization under most simulated conditions
Significant performance improvements in some scenarios
Effective management of low-priority jobs
Abstract
We propose an approach to utilize idle computational resources of supercomputers. The idea is to maintain an additional queue of low-priority non-parallel jobs and execute them in containers, using container migration tools to break the execution down into separate intervals. We propose a container management system that can maintain this queue and interact with the supercomputer scheduler. We conducted a series of experiments simulating a supercomputer scheduler and the proposed system. The experiments demonstrate that the proposed system increases the effective utilization of supercomputer resources under most conditions, in some cases significantly improving performance.
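The interval-based execution described in the abstract can be illustrated with a toy model. The sketch below is not the authors' system or simulator. It assumes a single node whose idle windows are known in advance, a first-in-first-out low-priority queue, and fixed checkpoint and restore costs. The names `LowPriorityJob`, `fill_idle_windows`, `checkpoint_cost`, and `restore_cost` are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class LowPriorityJob:
    name: str
    remaining: float       # compute time still needed, in seconds
    started: bool = False  # True once the job has been checkpointed at least once


def fill_idle_windows(windows, jobs, checkpoint_cost=1.0, restore_cost=1.0):
    """Run queued single-node jobs inside the idle windows of one node.

    windows: sorted, non-overlapping (start, end) idle intervals (assumed known).
    Returns (useful_compute_time, names_of_finished_jobs).
    """
    queue = deque(jobs)
    useful = 0.0
    finished = []
    for start, end in windows:
        t = start
        while queue and t < end:
            job = queue[0]
            # A job that ran before must first be restored from its checkpoint.
            setup = restore_cost if job.started else 0.0
            if t + setup + job.remaining <= end:
                # The job completes inside this window, so no checkpoint is needed.
                t += setup + job.remaining
                useful += job.remaining
                job.remaining = 0.0
                finished.append(job.name)
                queue.popleft()
            else:
                # Reserve time at the end of the window to checkpoint the container.
                budget = end - t - setup - checkpoint_cost
                if budget <= 0:
                    break  # window too short to make any progress
                job.remaining -= budget
                job.started = True
                useful += budget
                t = end  # the checkpoint finishes as the window closes
    return useful, finished
```

For example, with idle windows `[(0, 10), (20, 30)]` and two jobs needing 12 s and 3 s, the call `fill_idle_windows([(0, 10), (20, 30)], [LowPriorityJob("A", 12.0), LowPriorityJob("B", 3.0)])` returns `(15.0, ['A', 'B'])`: job A computes for 9 s, is checkpointed, then is restored in the second window and finishes, after which job B runs to completion. In this toy model 15 of the 20 idle seconds become useful work, with the rest spent on checkpoint and restore overhead or left unused.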
Taxonomy
Topics: Distributed and Parallel Computing Systems · Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques
