An Optimal Level-synchronous Shared-memory Parallel BFS Algorithm with Optimal parallel Prefix-sum Algorithm and its Implications for Energy Consumption
Jesmin Jahan Tithi, Yonatan Fogel, Rezaul Chowdhury

TL;DR
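This paper introduces a work-efficient, level-synchronous parallel BFS algorithm for shared-memory systems that meets the theoretical lower bound on parallel running time, saves energy by idling cores it does not need, uses no locks or atomic instructions, and extends to shared-memory coprocessors.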
Contribution
It presents a level-synchronous BFS algorithm that achieves the theoretical lower bound on parallel running time and shows empirically that idling unneeded cores reduces energy consumption.
Findings
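Achieves the theoretical lower bound on parallel BFS running time, regardless of the shape of the graph
Reduces energy consumption by idling cores that are not needed in a given computation step
Uses no locks or atomic instructions and is easy to extend to shared-memory coprocessors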
Abstract
We present a work-efficient parallel level-synchronous Breadth First Search (BFS) algorithm for shared-memory architectures which achieves the theoretical lower bound on parallel running time. The optimality holds regardless of the shape of the graph. We also demonstrate empirically the implication of this optimality for the energy consumption of the program. The key idea is never to use more processing cores than necessary to complete the work in any computation step efficiently. We keep the rest of the cores idle to save energy and to reduce contention for other resources (e.g., bandwidth, shared caches, etc.). Our BFS does not use locks or atomic instructions and is easily extensible to shared-memory coprocessors.
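The abstract describes the approach only at a high level, so the following is a minimal sequential sketch of the general technique rather than the authors' implementation. It shows how a prefix sum over frontier degrees gives each frontier vertex a private output slice (which is what lets a parallel version avoid locks), and how a per-level core count could be capped by the available work. The function names, the `grain` parameter, and the core-selection rule are hypothetical, and the duplicate filtering is done sequentially here because the abstract does not say how the paper handles it.

```python
from itertools import accumulate


def exclusive_prefix_sum(values):
    """Return (exclusive prefix sums, total) for a list of non-negative ints."""
    inclusive = list(accumulate(values))
    total = inclusive[-1] if inclusive else 0
    return [0] + inclusive[:-1], total


def cores_for_level(work, max_cores, grain=4):
    """Hypothetical rule: one core per `grain` units of work, clamped to [1, max_cores]."""
    return max(1, min(max_cores, work // grain))


def level_sync_bfs(adj, source, max_cores=8):
    """Level-synchronous BFS over adjacency lists; returns (distances, cores per level)."""
    dist = [-1] * len(adj)
    dist[source] = 0
    frontier = [source]
    cores_used = []
    level = 0
    while frontier:
        # The prefix sum over degrees assigns each frontier vertex a disjoint
        # slice of the candidate array, so writers never collide.
        degrees = [len(adj[u]) for u in frontier]
        offsets, total = exclusive_prefix_sum(degrees)

        # Record how many cores this level would use; the rest would stay idle.
        cores_used.append(cores_for_level(total, max_cores))

        # In a parallel version each iteration of this loop is independent.
        candidates = [None] * total
        for i, u in enumerate(frontier):
            candidates[offsets[i]:offsets[i] + degrees[i]] = adj[u]

        # Sequential stand-in for the filtering step: keep unvisited vertices once.
        level += 1
        next_frontier = []
        for v in candidates:
            if dist[v] == -1:
                dist[v] = level
                next_frontier.append(v)
        frontier = next_frontier
    return dist, cores_used


if __name__ == "__main__":
    graph = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3]]
    print(level_sync_bfs(graph, 0))
```

With a small frontier, `cores_for_level` returns 1, which mirrors the idea of not waking more cores than a step can keep busy. A real implementation would replace the two inner loops with parallel loops and a parallel prefix sum.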
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Optimization and Search Problems
