DMRlib: Easy-coding and Efficient Resource Management for Job Malleability
Sergio Iserte, Rafael Mayo, Enrique S. Quintana-Ortí, Antonio J. Peña

TL;DR
DMRlib is a library that simplifies the development of malleable applications, enabling efficient resource management and significantly improving throughput in data centers.
Contribution
It introduces a minimalist MPI-like library with predefined communication patterns to facilitate process malleability adoption.
Findings
Resource allocation rate increased by over 3x with malleability.
Malleable jobs improved energy efficiency and throughput.
Demonstrated positive impact across various scalability scenarios.
Abstract
Process malleability has proved to have a highly positive impact on the resource utilization and global productivity in data centers compared with the conventional static resource allocation policy. However, the non-negligible additional development effort this solution imposes has constrained its adoption by the scientific programming community. In this work, we present DMRlib, a library designed to offer the global advantages of process malleability while providing a minimalist MPI-like syntax. The library includes a series of predefined communication patterns that greatly ease the development of malleable applications. In addition, we deploy several scenarios to demonstrate the positive impact of process malleability featuring different scalability patterns. Concretely, we study two job submission modes (rigid and moldable) in order to identify the best-case scenarios for…
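To make the idea of a predefined communication pattern concrete, here is a minimal sketch of the bookkeeping behind one common case: redistributing a block-partitioned array when a job is resized from one process count to another. It is an illustration only and does not use or reproduce DMRlib's API. The function names `block_bounds` and `redistribution_plan` are hypothetical, and the plan is computed in plain Python rather than executed over MPI.

```python
def block_bounds(n, parts, rank):
    """Return the half-open index range [start, stop) owned by `rank`
    when `n` elements are split into `parts` contiguous blocks."""
    base, extra = divmod(n, parts)
    # The first `extra` ranks each hold one additional element.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def redistribution_plan(n, old_procs, new_procs):
    """List the (src_rank, dst_rank, start, stop) transfers needed to move
    from a block distribution over `old_procs` to one over `new_procs`."""
    plan = []
    for src in range(old_procs):
        s_start, s_stop = block_bounds(n, old_procs, src)
        for dst in range(new_procs):
            d_start, d_stop = block_bounds(n, new_procs, dst)
            # Overlap of the old owner's range and the new owner's range.
            lo, hi = max(s_start, d_start), min(s_stop, d_stop)
            if lo < hi:
                plan.append((src, dst, lo, hi))
    return plan


if __name__ == "__main__":
    # Hypothetical resize: expand a 12-element array from 2 to 4 processes.
    for transfer in redistribution_plan(12, 2, 4):
        print(transfer)
```

In a real malleable application each tuple would correspond to a point-to-point or collective transfer between the old and new process sets. Keeping this logic inside a library is the kind of development effort a predefined pattern is meant to save.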
