Improving the Load Balancing Performance of Vlasiator
Ata Turk, Cevdet Aykanat, G. Vehbi Demirci, Sebastian von Alfthan, Ilja Honkonen

TL;DR
This paper addresses load-balancing performance issues in Vlasiator, a large-scale Vlasov-hybrid simulation code, by proposing and comparing alternative partitioning schemes to improve efficiency during petascaling.
Contribution
It introduces alternative graph-partitioning schemes that reduce preprocessing overhead and improve load balance compared to hypergraph partitioning in Vlasiator.
Findings
Hypergraph partitioning is time-consuming and impacts overall performance.
Alternative schemes like ParMeTiS and PT-SCOTCH offer comparable load balancing with less overhead.
Test results show improved efficiency on a BlueGene/P cluster.
Abstract
This whitepaper describes the load-balancing performance issues that were observed and tackled during the petascaling of the Vlasiator code. Vlasiator is a Vlasov-hybrid simulation code developed at the Finnish Meteorological Institute (FMI). Vlasiator models the communication associated with its spatial grid as a hypergraph and partitions the grid using the parallel hypergraph partitioning scheme (PHG) of the Zoltan partitioning framework. The result of partitioning determines the distribution of grid cells to processors. It is observed that the partitioning phase takes a substantial percentage of the overall computation time. Alternative (graph-partitioning-based) schemes that perform almost as well as the hypergraph partitioning scheme, while requiring less preprocessing overhead and achieving better balance, are proposed and investigated. A comparison in terms of effect on running…
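To make the distinction between the two partitioning models concrete, the following is a minimal Python sketch, not Vlasiator's or Zoltan's actual code. It builds a small 2D grid, assigns cells to parts, and evaluates the objective each model typically minimizes: the edge cut for a graph model and the connectivity-minus-one metric for a hypergraph model, plus a simple load-imbalance ratio. The 4-point stencil, the choice of one net per cell, the unit cell weights, and all function names are assumptions made for illustration only.

```python
def grid_neighbors(nx, ny):
    """Map each cell id to its 4-point-stencil neighbours on a
    non-periodic nx-by-ny grid (an assumed toy stencil)."""
    nbrs = {}
    for y in range(ny):
        for x in range(nx):
            cell = y * nx + x
            adj = []
            if x > 0:
                adj.append(cell - 1)
            if x < nx - 1:
                adj.append(cell + 1)
            if y > 0:
                adj.append(cell - nx)
            if y < ny - 1:
                adj.append(cell + nx)
            nbrs[cell] = adj
    return nbrs


def edge_cut(nbrs, part):
    """Graph model: count neighbour pairs whose cells lie in different parts."""
    return sum(
        1
        for cell, adj in nbrs.items()
        for n in adj
        if cell < n and part[cell] != part[n]  # count each undirected edge once
    )


def connectivity_minus_one(nbrs, part):
    """Hypergraph model (assumed): one net per cell, containing the cell and
    its neighbours. Each net costs (number of parts it spans - 1)."""
    total = 0
    for cell, adj in nbrs.items():
        spanned = {part[cell]} | {part[n] for n in adj}
        total += len(spanned) - 1
    return total


def imbalance(part, nparts, weights=None):
    """Ratio of the heaviest part's load to the average load (1.0 is perfect)."""
    loads = [0.0] * nparts
    for cell, p in part.items():
        loads[p] += 1.0 if weights is None else weights[cell]
    return max(loads) / (sum(loads) / nparts)


# Toy example: a 4x4 grid split into left and right halves.
NX, NY = 4, 4
nbrs = grid_neighbors(NX, NY)
halves = {c: (0 if c % NX < 2 else 1) for c in nbrs}

print("edge cut:", edge_cut(nbrs, halves))
print("connectivity - 1:", connectivity_minus_one(nbrs, halves))
print("imbalance:", imbalance(halves, 2))
```

In this sketch the connectivity-minus-one metric counts how many remote parts need each cell's data, whereas the edge cut only counts split neighbour pairs. A real run would pass the grid to a partitioner rather than fix the assignment by hand.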
Taxonomy
Topics: Interconnection Networks and Systems · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
