TL;DR
This paper analyzes the C++ implementation of BigClam for community detection in large networks and demonstrates that parallelizing a key step with OpenMP significantly accelerates the algorithm without affecting accuracy.
Contribution
The paper identifies the bottleneck in the SNAP C++ implementation of BigClam and parallelizes it with OpenMP, speeding up the algorithm by 5.3 times on large networks.
Findings
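Parallelization with OpenMP speeds up the algorithm by 5.3x on large networks.
The node-community assignment step (determining whether a node belongs to a community) dominates the runtime and was not previously parallelized.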
Parallel implementation preserves algorithm accuracy.
Abstract
We perform a detailed analysis of the C++ implementation of the Cluster Affiliation Model for Big Networks (BigClam) in the Stanford Network Analysis Project (SNAP). BigClam is a popular graph mining algorithm that is capable of finding overlapping communities in networks containing millions of nodes. Our analysis shows that a key stage of the algorithm, determining whether a node belongs to a community, dominates the runtime of the implementation, yet the computation is not parallelized. We show that by parallelizing this computation across multiple threads using OpenMP we can speed up the algorithm by 5.3 times when detecting communities in large networks, while preserving the integrity of the program and the result.
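The paper's code is not reproduced on this page. As a rough illustration of the kind of change the abstract describes, the C++ sketch below parallelizes a per-node membership test with OpenMP. Everything in it is an assumption made for illustration: the function name `assign_communities`, the dense `Matrix` type, and the threshold parameter `delta` are hypothetical and are not taken from SNAP's source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense layout: F[u][c] is the non-negative affiliation
// strength of node u to community c. Not SNAP's actual data structure.
using Matrix = std::vector<std::vector<double>>;

// For every node, collect the communities whose affiliation strength
// reaches the threshold delta (an assumed, caller-supplied cutoff).
std::vector<std::vector<int>> assign_communities(const Matrix& F, double delta) {
    assert(delta >= 0.0);  // affiliations are non-negative, so is the cutoff

    const long long n = static_cast<long long>(F.size());
    std::vector<std::vector<int>> members(n);

    // Each iteration reads F[u] and writes only members[u], so iterations
    // are independent and can be distributed across threads without locks.
    #pragma omp parallel for schedule(dynamic, 64)
    for (long long u = 0; u < n; ++u) {
        for (std::size_t c = 0; c < F[u].size(); ++c) {
            if (F[u][c] >= delta) {
                members[u].push_back(static_cast<int>(c));
            }
        }
    }
    return members;
}
```

Because each iteration writes only to its own `members[u]`, the threads need no synchronization and the output matches the serial loop exactly. When built with an OpenMP flag such as `-fopenmp` on GCC or Clang the loop runs in parallel; without the flag the pragma is ignored and the same code runs serially.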
