High-performance Computation of Kubo Formula with Vectorization of Batched Linear Algebra Operation
Yuta Yahagi and Toshihiro Kato

TL;DR
This paper presents a vectorized method for computing the Kubo formula on vector processors. It evaluates multiple integration points in parallel through batched linear algebra operations and uses padding to avoid memory-bank conflicts, running approximately 2.2 times faster than the baseline.
Contribution
The paper introduces a vectorization approach for Kubo formula calculations based on batched linear algebra, and benchmarks it on the vector-based NEC SX-Aurora TSUBASA against scalar-based Xeon machines.
Findings
The vectorized implementation is approximately 2.2 times faster than the baseline in node performance.
Padding improves performance, indicating that avoiding memory-bank conflicts is critically important for this type of task.
Batched linear algebra enables efficient parallel evaluation of integration points.
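The batching idea in the last finding can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the per-point operation (a resolvent-type inverse of `z - H_k`), the random Hermitian matrices, and the names `greens_loop` and `greens_batched` are assumptions chosen only as a stand-in for the linear algebra performed at each integration point. The sketch shows that one batched call over a stacked array gives the same result as a loop over points.

```python
import numpy as np

def greens_loop(H, energy, eta):
    """Invert (z - H_k) one integration point at a time."""
    n = H.shape[-1]
    z = (energy + 1j * eta) * np.eye(n)
    return np.array([np.linalg.inv(z - Hk) for Hk in H])

def greens_batched(H, energy, eta):
    """Invert (z - H_k) for all integration points in one batched call."""
    n = H.shape[-1]
    z = (energy + 1j * eta) * np.eye(n)
    # z broadcasts over the leading batch axis; np.linalg.inv accepts
    # stacked matrices of shape (..., n, n).
    return np.linalg.inv(z - H)

# Hypothetical toy data: 64 integration points, 8x8 Hermitian matrices.
rng = np.random.default_rng(0)
batch, n = 64, 8
A = rng.standard_normal((batch, n, n)) + 1j * rng.standard_normal((batch, n, n))
H = 0.5 * (A + A.conj().transpose(0, 2, 1))

G_loop = greens_loop(H, energy=0.1, eta=0.05)
G_batched = greens_batched(H, energy=0.1, eta=0.05)
```

Placing the integration points on the leading axis lets a single library call process the whole batch, which is the kind of layout a vector processor can exploit; how the paper arranges its data in detail is not stated in this summary.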
Abstract
We propose a method to accelerate the computation of the Kubo formula, optimized for vector processors. The key concept is the parallel evaluation of multiple integration points, enabled by batched linear algebra operations. Through benchmark comparisons of node performance between the vector-based NEC SX-Aurora TSUBASA and scalar-based Xeon machines, we verified that the vectorized implementation is approximately 2.2 times faster than the baseline. We also show that padding improves performance, indicating that avoiding memory-bank conflicts is critically important in this type of task.
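The padding mentioned in the abstract can be sketched as follows. This is only an illustration of one common form of padding, under stated assumptions: the helper names `padded_leading_dim` and `alloc_padded_batch` are hypothetical, and the value `conflict_stride=32` is a placeholder rather than a property of any specific hardware. The idea is to store each matrix row with a stride that is not a multiple of the conflicting stride, while the computation works on an unpadded view.

```python
import numpy as np

def padded_leading_dim(n, conflict_stride=32):
    """Return a row length >= n that is not a multiple of conflict_stride.

    conflict_stride is a hypothetical placeholder, not a hardware constant.
    """
    return n + 1 if n % conflict_stride == 0 else n

def alloc_padded_batch(batch, n, conflict_stride=32, dtype=np.complex128):
    """Allocate padded storage and return (storage, logical n x n view)."""
    ld = padded_leading_dim(n, conflict_stride)
    storage = np.zeros((batch, n, ld), dtype=dtype)
    # The view exposes only the first n columns; the pad column is unused.
    return storage, storage[:, :, :n]

storage, mats = alloc_padded_batch(batch=4, n=64)
# Distance between consecutive rows, measured in elements.
row_stride_elems = mats.strides[1] // mats.itemsize
```

For `n = 64` the rows are stored 65 elements apart, so walking down a column no longer uses a stride that is a multiple of 32. Whether the paper pads in exactly this way is not stated in this summary.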
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Distributed and Parallel Computing Systems
