TL;DR
This paper introduces fine-grained parallel versions of the Johnson and Read-Tarjan cycle enumeration algorithms that scale linearly with the number of CPU cores and significantly outperform coarse-grained versions on a cluster of multi-core CPUs.
Contribution
First fine-grained parallelization of the Johnson and Read-Tarjan cycle enumeration algorithms, and first experimental demonstration of linear performance scaling for them.
Findings
Fine-grained parallel algorithms are an order of magnitude faster than coarse-grained ones.
Linear performance scaling is achieved by decomposing long sequential searches into fine-grained tasks that are dynamically scheduled across CPU cores.
On 256 CPU cores, the algorithms are 260 times faster than serial methods.
Abstract
Enumerating simple cycles has important applications in computational biology, network science, and financial crime analysis. In this work, we focus on parallelising the state-of-the-art simple cycle enumeration algorithms by Johnson and Read-Tarjan along with their applications to temporal graphs. To our knowledge, we are the first to parallelise these two algorithms in a fine-grained manner. We are also the first to experimentally demonstrate linear performance scaling. Such scaling is made possible by our decomposition of long sequential searches into fine-grained tasks, which are then dynamically scheduled across CPU cores, enabling optimal load balancing. Furthermore, we show that coarse-grained parallel versions of the Johnson and the Read-Tarjan algorithms that exploit edge- or vertex-level parallelism are not scalable. On a cluster of four multi-core CPUs with …
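The difference between coarse-grained and fine-grained decomposition can be illustrated with a minimal Python sketch. It is not the paper's implementation and it is neither the Johnson nor the Read-Tarjan algorithm: it uses a plain backtracking search, and the names `extend`, `split_tasks`, `enumerate_cycles`, and `EXAMPLE_GRAPH` are hypothetical. It also splits the searches up front to a fixed depth, whereas the abstract describes tasks that are scheduled dynamically during the search. With `split_depth=0` there is one task per start vertex (vertex-level, coarse-grained); a larger `split_depth` cuts each search into smaller tasks that idle workers pick up from a shared queue. Python threads share the interpreter lock, so this shows the task structure only, not a speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical example graph: adjacency lists of a small directed graph.
EXAMPLE_GRAPH = {0: [1, 2], 1: [2, 0], 2: [0, 3], 3: [1]}


def extend(graph, path):
    """Serially enumerate all simple cycles that extend `path` back to path[0].

    Only vertices larger than path[0] are visited, so each cycle is
    reported exactly once, rooted at its smallest vertex.
    """
    start = path[0]
    cycles = []
    for nxt in graph.get(path[-1], ()):
        if nxt == start:
            cycles.append(list(path))
        elif nxt > start and nxt not in path:
            cycles.extend(extend(graph, path + [nxt]))
    return cycles


def split_tasks(graph, split_depth):
    """Expand every search `split_depth` levels and return (tasks, found).

    Each task is a partial path to be finished by `extend`; `found` holds
    the cycles that already closed before reaching the split depth.
    """
    frontier = [[v] for v in graph]
    found = []
    for _ in range(split_depth):
        next_frontier = []
        for path in frontier:
            start = path[0]
            for nxt in graph.get(path[-1], ()):
                if nxt == start:
                    found.append(list(path))
                elif nxt > start and nxt not in path:
                    next_frontier.append(path + [nxt])
        frontier = next_frontier
    return frontier, found


def enumerate_cycles(graph, split_depth=0, workers=4):
    """Enumerate simple cycles using a pool of workers over the split tasks."""
    tasks, found = split_tasks(graph, split_depth)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Workers take the next pending task as soon as they become free.
        for cycles in pool.map(lambda p: extend(graph, p), tasks):
            found.extend(cycles)
    return sorted(found)


if __name__ == "__main__":
    print(enumerate_cycles(EXAMPLE_GRAPH, split_depth=0))  # coarse-grained
    print(enumerate_cycles(EXAMPLE_GRAPH, split_depth=2))  # finer-grained
```

Both calls return the same set of cycles; only the number and size of the tasks differ. When one start vertex dominates the search, the coarse-grained split leaves most workers idle while a single task runs, which is the load imbalance that the fine-grained decomposition is meant to avoid.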
