Porting of the DBCSR library for Sparse Matrix-Matrix Multiplications to Intel Xeon Phi systems
Iain Bethune, Andreas Gloess, Juerg Hutter, Alfio Lazzaro, Hans Pabst, Fiona Reid

TL;DR
This paper evaluates the performance of the DBCSR library for sparse matrix multiplications on Intel Xeon Phi KNL systems, comparing it with CPU and GPU-based systems to understand its efficiency and scalability.
Contribution
It provides a detailed performance comparison of DBCSR on KNL systems versus CPU and GPU systems, highlighting its efficiency and potential for electronic structure simulations.
Findings
DBCSR on Cray XC40 KNL-based systems is 11%-14% slower than on a hybrid Cray XC50 with Nvidia P100 cards.
Compared with CPU-only systems, KNL is up to 24% faster.
Performance varies depending on hardware configurations and system architecture.
Abstract
Multiplication of two sparse matrices is a key operation in the simulation of the electronic structure of systems containing thousands of atoms and electrons. The highly optimized sparse linear algebra library DBCSR (Distributed Block Compressed Sparse Row) has been specifically designed to efficiently perform such sparse matrix-matrix multiplications. This library is the basic building block for linear scaling electronic structure theory and low scaling correlated methods in CP2K. It is parallelized using MPI and OpenMP, and can exploit GPU accelerators by means of CUDA. We describe a performance comparison of DBCSR on systems with Intel Xeon Phi Knights Landing (KNL) processors, with respect to systems with Intel Xeon CPUs (including systems with GPUs). We find that the DBCSR on Cray XC40 KNL-based systems is 11%-14% slower than on a hybrid Cray XC50 with Nvidia P100 cards, at the…
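To make the block compressed sparse row idea concrete, the following is a minimal single-process sketch of multiplying two block-sparse matrices while visiting only the stored blocks. It is an illustration under stated assumptions, not DBCSR's implementation or API: the names `BlockCSR`, `block_spgemm`, `dense_matmul`, and `dense_add` are hypothetical, blocks are plain Python lists, and the MPI, OpenMP, and CUDA layers mentioned in the abstract are left out entirely.

```python
def dense_matmul(a, b):
    """Multiply two small dense blocks given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def dense_add(a, b):
    """Element-wise sum of two dense blocks of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


class BlockCSR:
    """Hypothetical block CSR container: row pointers, block-column indices, blocks."""

    def __init__(self, n_block_rows, entries):
        # entries maps (block_row, block_col) -> dense block (list of rows).
        self.n_block_rows = n_block_rows
        self.row_ptr = [0]
        self.col_idx = []
        self.blocks = []
        for i in range(n_block_rows):
            for key in sorted(k for k in entries if k[0] == i):
                self.col_idx.append(key[1])
                self.blocks.append(entries[key])
            self.row_ptr.append(len(self.col_idx))

    def row(self, i):
        """Yield (block_col, block) pairs stored in block row i."""
        start, end = self.row_ptr[i], self.row_ptr[i + 1]
        return zip(self.col_idx[start:end], self.blocks[start:end])


def block_spgemm(a, b):
    """Return C = A * B as a dict of blocks, skipping absent (zero) blocks."""
    result = {}
    for i in range(a.n_block_rows):
        for k, a_blk in a.row(i):
            for j, b_blk in b.row(k):
                prod = dense_matmul(a_blk, b_blk)
                key = (i, j)
                result[key] = dense_add(result[key], prod) if key in result else prod
    return result


# Example: 2 x 2 grid of 2 x 2 blocks; missing keys are zero blocks.
A = BlockCSR(2, {
    (0, 0): [[1, 0], [0, 1]],
    (1, 0): [[2, 0], [0, 2]],
    (1, 1): [[1, 1], [1, 1]],
})
B = BlockCSR(2, {
    (0, 1): [[1, 2], [3, 4]],
    (1, 1): [[1, 0], [0, 1]],
})
C = block_spgemm(A, B)
```

Here `C` holds only the blocks `(0, 1)` and `(1, 1)`; block column 0 of the product never appears because `B` stores nothing there. The work is a sequence of small dense block products, which is the kind of operation that can be handed to optimized CPU or GPU kernels.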
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
