Minimizing CGYRO HPC Communication Costs in Ensembles with XGYRO by Sharing the Collisional Constant Tensor Structure
Igor Sfiligoi, Emily A. Belli, Jeff Candy

TL;DR
This paper introduces XGYRO, a tool that optimizes HPC ensemble simulations of fusion plasma by sharing the collisional constant tensor structure, significantly reducing memory and communication costs.
Contribution
XGYRO enables ensemble-based CGYRO simulations to share the collisional tensor, reducing memory use and communication overhead in high-performance computing environments.
Findings
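Sharing the collisional constant tensor, which dominates CGYRO's memory consumption, reduces the memory needed per simulation.
Communication overhead decreases, since the lower memory footprint lessens the need to spread each simulation across many HPC nodes.
Running the whole ensemble as a single HPC job enables optimizations that are not feasible for any single simulation.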
Abstract
First-principles fusion plasma simulations are both compute and memory intensive, and CGYRO is no exception. The use of many HPC nodes to fit the problem in the available memory thus results in significant communication overhead, which is hard to avoid for any single simulation. That said, most fusion studies are composed of ensembles of simulations, so we developed a new tool, named XGYRO, that executes a whole ensemble of CGYRO simulations as a single HPC job. By treating the ensemble as a unit, XGYRO can alter the global buffer distribution logic and apply optimizations that are not feasible on any single simulation, but only on the ensemble as a whole. The main saving comes from the sharing of the collisional constant tensor structure, since its values are typically identical between parameter-sweep simulations. This data structure dominates the memory consumption of CGYRO…
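The memory argument in the abstract can be illustrated with a small back-of-the-envelope model. This is only a sketch under stated assumptions, not XGYRO's actual buffer distribution logic: the class name `EnsembleMemoryModel`, its fields, and the sizes in the usage note are hypothetical, and the model assumes that a shared tensor is spread evenly over every node of the ensemble while all other buffers stay private to each simulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnsembleMemoryModel:
    """Toy per-node memory model for an ensemble (all values hypothetical)."""

    tensor_gb: float     # assumed size of the collisional constant tensor
    other_gb: float      # assumed size of all other buffers of one simulation
    n_sims: int          # number of simulations in the ensemble
    nodes_per_sim: int   # nodes assigned to each simulation

    def per_node_independent(self) -> float:
        # Each simulation keeps its own tensor copy, split over its own nodes.
        return (self.tensor_gb + self.other_gb) / self.nodes_per_sim

    def per_node_shared(self) -> float:
        # Assumption: one tensor copy is split over all nodes of the ensemble,
        # while the remaining buffers are still split per simulation.
        total_nodes = self.n_sims * self.nodes_per_sim
        return self.tensor_gb / total_nodes + self.other_gb / self.nodes_per_sim


model = EnsembleMemoryModel(tensor_gb=800.0, other_gb=200.0,
                            n_sims=8, nodes_per_sim=4)
independent_gb = model.per_node_independent()
shared_gb = model.per_node_shared()
print(f"independent: {independent_gb} GB/node, shared: {shared_gb} GB/node")
```

With these made-up sizes the per-node footprint drops from 250 GB to 75 GB. In this toy model, the saving grows with the share of memory taken by the tensor and with the number of simulations in the ensemble, which is consistent with the abstract's point that the tensor dominates memory consumption.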
