Comparison of HPC Architectures for Computing All-Pairs Shortest Paths. Intel Xeon Phi KNL vs NVIDIA Pascal

Manuel Costanzo; Enzo Rucci; Ulises Costi; Franco Chichizola; and Marcelo Naiouf

arXiv:2105.07298·cs.DC·May 18, 2021

TL;DR

This paper compares the performance and energy efficiency of NVIDIA Pascal GPUs and Intel Xeon Phi KNL processors on the Floyd-Warshall algorithm. The two devices are comparable in single precision, while Xeon Phi is superior in double precision.

Contribution

It provides a detailed comparison of two major HPC architectures using a representative graph algorithm, highlighting their relative strengths in energy efficiency and performance.

Findings

1. Xeon Phi KNL outperforms the Pascal GPU in double-precision computations.

2. Performance and energy efficiency of the two devices are comparable in single precision.

3. The study offers insights into optimizing HPC systems with accelerators.

Abstract

Today, one of the main challenges for high-performance computing systems is to improve performance while keeping energy consumption at acceptable levels. In this context, a consolidated strategy consists of using accelerators such as GPUs or many-core Intel Xeon Phi processors. In this work, devices of the NVIDIA Pascal and Intel Xeon Phi Knights Landing architectures are described and compared. Selecting the Floyd-Warshall algorithm as a representative case of graph and memory-bound applications, optimized implementations were developed to analyze and compare performance and energy efficiency on both devices. As expected, Xeon Phi proved superior for double-precision data. However, contrary to what our preliminary analysis suggested, the performance and energy efficiency of both devices were comparable when using the single-precision data type.
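
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of the textbook Floyd-Warshall recurrence in Python. It is not the authors' optimized implementation, which is not reproduced on this page; the function name `floyd_warshall` and the sample graph are illustrative assumptions. The sketch shows why the algorithm is commonly described as memory-bound: each inner-loop update performs one addition and one comparison while reading several matrix entries.

```python
import math

INF = math.inf  # marks "no edge" in the dense distance matrix


def floyd_warshall(dist):
    """Return all-pairs shortest-path distances.

    `dist` is a dense n x n matrix (list of lists): dist[i][j] is the weight
    of edge i -> j, INF if there is no edge, and 0 on the diagonal.
    Assumes the graph has no negative cycles. The input is not modified.
    """
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    for k in range(n):            # allow vertex k as an intermediate hop
        row_k = d[k]
        for i in range(n):
            d_ik = d[i][k]
            if d_ik == INF:       # no path i -> k, nothing to relax
                continue
            row_i = d[i]
            for j in range(n):
                candidate = d_ik + row_k[j]
                if candidate < row_i[j]:
                    row_i[j] = candidate
    return d


# Hypothetical 4-vertex directed graph used only for illustration.
graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
shortest = floyd_warshall(graph)
```

The three nested loops give O(n^3) work over an n x n matrix. Using a single-precision or double-precision element type changes the memory footprint of that matrix, which is the data type distinction the abstract refers to.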
