Benchmarking mixed-mode PETSc performance on high-performance architectures
Michael Lange, Gerard Gorman, Michele Weiland, Lawrence Mitchell, Xiaohu Guo, James Southern

TL;DR
This paper benchmarks the performance of mixed-mode PETSc, combining shared-memory and message passing, on various high-performance architectures, demonstrating improved scalability and efficiency for scientific computations.
Contribution
It introduces OpenMP threaded functionality to PETSc, evaluates mixed-mode performance across multiple architectures, and provides insights into optimizing parallel scalability for scientific applications.
Findings
Mixed-mode PETSc achieves significant speedups over pure MPI.
Explicit load balancing improves parallel performance.
Performance varies across different HPC architectures.
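The summary does not say how the load balancing is done. As an illustration only, one common approach for threaded sparse matrix-vector products is to give each thread a contiguous block of rows holding roughly the same number of nonzeros, rather than the same number of rows. The sketch below assumes a CSR matrix described by its row-pointer array; `balance_rows_by_nnz` and `nnz_per_block` are hypothetical helper names, not PETSc functions, and this is not necessarily the scheme the authors implemented.

```python
from bisect import bisect_left


def balance_rows_by_nnz(row_ptr, n_threads):
    """Split CSR rows into n_threads contiguous blocks of similar nonzero count.

    row_ptr is the CSR row-pointer array (length n_rows + 1, row_ptr[0] == 0).
    Returns n_threads + 1 row boundaries; block t covers rows
    bounds[t] .. bounds[t + 1] - 1. Hypothetical helper, not a PETSc API.
    """
    n_rows = len(row_ptr) - 1
    total_nnz = row_ptr[-1]
    bounds = [0]
    for t in range(1, n_threads):
        # Ideal cumulative nonzero count at the end of block t - 1.
        target = total_nnz * t / n_threads
        # First row boundary whose cumulative nonzero count reaches the target.
        cut = bisect_left(row_ptr, target)
        # Keep boundaries non-decreasing and inside the matrix.
        cut = min(max(cut, bounds[-1]), n_rows)
        bounds.append(cut)
    bounds.append(n_rows)
    return bounds


def nnz_per_block(row_ptr, bounds):
    """Number of nonzeros assigned to each block of rows."""
    return [row_ptr[bounds[i + 1]] - row_ptr[bounds[i]]
            for i in range(len(bounds) - 1)]


# Example: 9 rows, the first one dense (8 nonzeros), the rest with 1 each.
row_ptr = [0, 8, 9, 10, 11, 12, 13, 14, 15, 16]
bounds = balance_rows_by_nnz(row_ptr, 2)
work = nnz_per_block(row_ptr, bounds)
```

In this example a split by row count (rows 0 to 3 and rows 4 to 8) would give the two threads 11 and 5 nonzeros, whereas the nonzero-based split places the dense row alone in the first block so that both threads process 8 nonzeros.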
Abstract
The trend towards highly parallel multi-processing is ubiquitous in all modern computer architectures, ranging from handheld devices to large-scale HPC systems; yet many applications are struggling to fully utilise the multiple levels of parallelism exposed in modern high-performance platforms. In order to realise the full potential of recent hardware advances, a mixed-mode approach that combines shared-memory programming techniques with inter-node message passing can be adopted, which provides high levels of parallelism with minimal overheads. For scientific applications this entails that not only the simulation code itself but the whole software stack needs to evolve. In this paper, we evaluate the mixed-mode performance of PETSc, a widely used scientific library for the scalable solution of partial differential equations. We describe the addition of OpenMP threaded functionality to the library,…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
