Dynamic resource allocation for efficient parallel CFD simulations
G. Houzeaux, R.M. Badia, R. Borrell, D. Dosimont, J. Ejarque, M. Garcia-Gasulla, V. López

TL;DR
This paper introduces an adaptive runtime resource management method for parallel CFD simulations, optimizing resource use based on communication efficiency to improve performance and reduce waste.
Contribution
It proposes a novel elastic computing approach that dynamically adjusts resources during simulation based on communication efficiency metrics.
Findings
Resource allocation adapts at runtime to maintain communication efficiency
Reduces computational waste by avoiding over-provisioning
Improves overall parallel simulation performance
Abstract
CFD users of supercomputers usually resort to rule-of-thumb methods to select the number of subdomains (partitions) when relying on MPI-based parallelization. One common approach is to set a minimum number of elements or cells per subdomain, under which the parallel efficiency of the code is "known" to fall below a subjective level, say 80%. The situation is even worse when the user is not aware of the "good" practices for the given code and a huge amount of resources can thus be wasted. This work presents an elastic computing methodology to adapt at runtime the resources allocated to a simulation automatically. The criterion to control the required resources is based on a runtime measure of the communication efficiency of the execution. According to some analytical estimates, the resources are then expanded or reduced to fulfil this criterion and eventually execute an efficient…
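The control loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: communication efficiency is computed as the maximum per-rank computation time divided by the elapsed time (a commonly used definition), and the resize rule assumes computation time scales as 1/p while communication time per rank stays constant. The function names `communication_efficiency` and `suggest_ranks` are hypothetical, and the paper's own analytical estimates may differ.

```python
from typing import Sequence


def communication_efficiency(compute_times: Sequence[float], elapsed: float) -> float:
    """Ratio of the largest per-rank computation time to the elapsed time.

    Assumed definition (not taken from the paper): a value of 1.0 means no
    time is lost in communication; lower values mean more communication cost.
    """
    if elapsed <= 0.0 or not compute_times:
        raise ValueError("need at least one rank and a positive elapsed time")
    return max(compute_times) / elapsed


def suggest_ranks(current_ranks: int, measured_ce: float, target_ce: float) -> int:
    """Suggest a rank count that would meet target_ce under a toy model.

    Toy model (an assumption for this sketch): computation time per rank
    scales as 1/p and communication time per rank is constant, so
    CE(p) = (W/p) / (W/p + c). Solving CE(p) = target_ce for p gives
    p = p0 * [CE0 / (1 - CE0)] * [(1 - target) / target].
    """
    if not (0.0 < measured_ce < 1.0 and 0.0 < target_ce < 1.0):
        raise ValueError("efficiencies must lie strictly between 0 and 1")
    compute_to_comm = measured_ce / (1.0 - measured_ce)  # (W/p0) / c
    new_ranks = current_ranks * compute_to_comm * (1.0 - target_ce) / target_ce
    return max(1, round(new_ranks))


if __name__ == "__main__":
    # Hypothetical measurements: three ranks, 12.5 s elapsed for the step.
    ce = communication_efficiency([8.0, 9.0, 10.0], elapsed=12.5)
    print(f"measured communication efficiency: {ce:.2f}")
    # Below the target the suggestion shrinks the job; above it, it grows.
    print("suggested ranks:", suggest_ranks(current_ranks=100, measured_ce=0.5, target_ce=0.8))
    print("suggested ranks:", suggest_ranks(current_ranks=100, measured_ce=0.9, target_ce=0.8))
```

Under this toy model, a measured efficiency below the target leads to a smaller allocation (fewer, larger subdomains), and a measured efficiency above the target leaves room to expand. A real implementation would also need a mechanism to repartition the mesh and to request or release nodes, which the sketch leaves out.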
