Comparison of HPC Architectures for Computing All-Pairs Shortest Paths. Intel Xeon Phi KNL vs NVIDIA Pascal
Manuel Costanzo, Enzo Rucci, Ulises Costi, Franco Chichizola, and Marcelo Naiouf

TL;DR
This paper compares the performance and energy efficiency of NVIDIA Pascal GPUs and Intel Xeon Phi KNL processors on the Floyd-Warshall algorithm. The two devices are comparable in single precision, while Xeon Phi is superior in double precision.
Contribution
It provides a detailed comparison of two major HPC architectures using a representative graph algorithm, highlighting their relative strengths in energy efficiency and performance.
Findings
Xeon Phi outperforms the Pascal GPU when using double-precision data.
Performance and energy efficiency of both devices are comparable when using single-precision data.
The study offers insights into optimizing HPC systems with accelerators.
Abstract
Today, one of the main challenges for high-performance computing systems is to improve their performance while keeping energy consumption at acceptable levels. In this context, a consolidated strategy consists of using accelerators such as GPUs or many-core Intel Xeon Phi processors. In this work, devices of the NVIDIA Pascal and Intel Xeon Phi Knights Landing architectures are described and compared. Selecting the Floyd-Warshall algorithm as a representative case of graph and memory-bound applications, optimized implementations were developed to analyze and compare performance and energy efficiency on both devices. As expected, Xeon Phi proved superior when considering double-precision data. However, contrary to what was considered in our preliminary analysis, it was found that the performance and energy efficiency of both devices were comparable when using the single-precision datatype.
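For readers unfamiliar with the benchmark, the following is a minimal sketch of the textbook Floyd-Warshall algorithm in Python. It is not the authors' optimized Xeon Phi or GPU implementation, which the abstract does not show; the function name `floyd_warshall` and the 4-vertex example graph are illustrative assumptions. The sketch makes the memory-bound character visible: the triple loop performs O(n^3) simple add-and-compare operations over an O(n^2) distance matrix.

```python
import math

INF = math.inf  # marks "no direct edge" between two vertices


def floyd_warshall(dist):
    """Return all-pairs shortest-path distances for a dense adjacency matrix.

    `dist[i][j]` is the weight of edge i -> j, INF if there is no edge,
    and 0 on the diagonal. The input is not modified.
    Assumes the graph has no negative-weight cycles.
    """
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    for k in range(n):            # allow vertex k as an intermediate hop
        row_k = d[k]
        for i in range(n):
            d_ik = d[i][k]
            if d_ik == INF:
                continue          # no path i -> k, so nothing can improve
            row_i = d[i]
            for j in range(n):
                candidate = d_ik + row_k[j]
                if candidate < row_i[j]:
                    row_i[j] = candidate
    return d


# Hypothetical 4-vertex directed graph used only for illustration.
graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]

result = floyd_warshall(graph)
for row in result:
    print(row)
```

Each inner-loop iteration does one addition and one comparison per matrix element loaded, which is why the algorithm is commonly treated as a memory-bound workload and why the choice of single- or double-precision data directly changes the volume of data moved.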
