Vectorization of Hybrid Breadth First Search on the Intel Xeon Phi
Mireya Paredes, Graham Riley, Mikel Lujan

TL;DR
This paper presents a new vectorized implementation of hybrid BFS on the Intel Xeon Phi, achieving a 33% performance improvement over the state of the art on a graph with one million vertices by exploiting the processor's vector processing capabilities.
Contribution
The paper introduces a novel vectorization approach for hybrid BFS on Xeon Phi, addressing irregular memory access and workload imbalance issues.
Findings
33% performance improvement over the state of the art on a graph with one million vertices
Effective utilization of Xeon Phi's vector processing capabilities
Enhanced scalability of BFS algorithms
Abstract
The Breadth-First Search (BFS) algorithm is an important building block for graph analysis of large datasets. Parallelising BFS has been shown to be challenging because of its inherent characteristics, including irregular memory access patterns, data dependencies and workload imbalance, which limit its scalability. We investigate the optimisation and vectorisation of the hybrid BFS (a combination of top-down and bottom-up approaches for BFS) on the Xeon Phi, which has advanced vector processing capabilities. The results show that our new implementation improves performance by 33%, for a graph with one million vertices, compared to the state of the art.
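To make the "hybrid" idea in the abstract concrete, the following is a minimal scalar sketch in Python of a BFS that switches between top-down and bottom-up steps. It is not the paper's vectorised Xeon Phi implementation. The function name `hybrid_bfs`, the parameter `alpha`, and the switching rule (go bottom-up when the frontier's edge count exceeds 1/`alpha` of the edges still attached to unvisited vertices) are illustrative assumptions in the style of direction-optimizing BFS, and the graph is assumed to be undirected and stored as adjacency lists.

```python
from typing import List


def hybrid_bfs(adj: List[List[int]], source: int, alpha: int = 4) -> List[int]:
    """Return the BFS level of every vertex (-1 if unreachable).

    `adj` is an undirected graph as adjacency lists. `alpha` is a
    hypothetical tuning knob, not a value taken from the paper.
    """
    n = len(adj)
    level = [-1] * n
    level[source] = 0
    frontier = [source]
    depth = 0
    # Edges incident to vertices that have not been visited yet.
    unvisited_edges = sum(len(a) for a in adj) - len(adj[source])

    while frontier:
        frontier_edges = sum(len(adj[v]) for v in frontier)
        next_frontier = []

        if frontier_edges * alpha > unvisited_edges:
            # Bottom-up step: every unvisited vertex looks for a parent
            # in the current frontier and stops at the first one found.
            in_frontier = [False] * n
            for v in frontier:
                in_frontier[v] = True
            for v in range(n):
                if level[v] == -1:
                    for u in adj[v]:
                        if in_frontier[u]:
                            level[v] = depth + 1
                            next_frontier.append(v)
                            break
        else:
            # Top-down step: every frontier vertex visits its neighbours.
            for u in frontier:
                for v in adj[u]:
                    if level[v] == -1:
                        level[v] = depth + 1
                        next_frontier.append(v)

        unvisited_edges -= sum(len(adj[v]) for v in next_frontier)
        frontier = next_frontier
        depth += 1

    return level


if __name__ == "__main__":
    # Path 0-1-2-3 plus an isolated vertex 4.
    graph = [[1], [0, 2], [1, 3], [2], []]
    print(hybrid_bfs(graph, 0))
```

Both step types compute the same levels; only the amount of work differs. The bottom-up step can stop scanning a vertex's neighbours as soon as one parent is found, which tends to pay off when the frontier is large, and its loop over all vertices with a per-vertex membership test is the kind of regular access pattern that lends itself to vector instructions.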
Taxonomy
Topics: Graph Theory and Algorithms · Parallel Computing and Optimization Techniques · Algorithms and Data Compression
