Performance Optimizations of Recursive Electronic Structure Solvers targeting Multi-Core Architectures (LA-UR-20-26665)
Adetokunbo A. Adedoyin, Christian F. A. Negre, Jamaludin Mohd-Yusof, Nicolas Bock, Daniel Osei-Kuffuor, Jean-Luc Fattebert, Michael E. Wall, Anders M. N. Niklasson, Susan M. Mniszewski

TL;DR
This paper explores optimization techniques for recursive electronic structure solvers on multi-core architectures. It tunes micro-kernels to improve performance, then applies the resulting strategies to the BML library used in quantum molecular dynamics.
Contribution
It introduces a comprehensive micro-kernel optimization approach and demonstrates how these strategies enhance the performance of the BML library on modern multi-core systems.
Findings
Optimizations led to measurable performance improvements.
Memory and thread affinity optimizations increased efficiency.
Guided propagation of optimizations improved overall code performance.
Abstract
As we rapidly approach the frontiers of ultra-large computing resources, software optimization is becoming of paramount interest to scientific application developers interested in efficiently leveraging all available on-node computing capabilities and thereby improving a requisite science-per-watt metric. The scientific application of interest here is the Basic Math Library (BML), which provides a singular interface for linear algebra operations frequently used in the Quantum Molecular Dynamics (QMD) community. The provisioning of a singular interface indicates the presence of an abstraction layer, which in turn suggests commonalities in the code base; therefore, any optimization or tuning introduced in the core of the code base has the ability to positively affect the performance of the library as a whole. With that in mind, we proceed with this investigation by performing a…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Distributed and Parallel Computing Systems
