Towards the Democratization and Standardization of Dynamic Resources with MPI Spawning
Sergio Iserte, Iker Martín-Alvarez, Krzysztof Rojek, José I. Aliaga, Maribel Castillo, and Antonio J. Peña

TL;DR
This paper introduces a flexible, unified API and framework for dynamic resource management in HPC, improving reconfiguration capabilities and supporting malleable applications like MPDATA.
Contribution
It presents an enhanced, modular DMR framework that integrates the Proteo reconfiguration engine, enabling diverse reconfiguration strategies that do not require respawning all processes.
Findings
Improved reconfiguration support with Proteo engine
Enhanced modularity of DMR framework
Demonstrated performance with MPDATA application
Abstract
This paper presents an efficient tool for managing dynamic resources in production high-performance computing (HPC) settings, focusing on flexibility, adaptability, and user-friendliness. We introduce a unified dynamic resource management application programming interface (API) that supports a wide range of HPC applications, allowing seamless integration without direct interaction with Dynamic Management of Resources (DMR). The DMR framework, evolved from the DMRlib structure, now supports various dynamic resource managers and includes the Proteo reconfiguration engine to enhance malleability strategies. This integration addresses previous limitations, namely the need to respawn all processes and the lack of resource management system (RMS) support, by allowing diverse reconfiguration methods. The paper also showcases the solution's performance and coding productivity with the MPDATA (Multidimensional Positive Definite Advection…
