Speeding Up BigClam Implementation on SNAP

C. H. Bryan Liu; Benjamin Paul Chamberlain

arXiv:1712.01209 · cs.SI · September 6, 2019

TL;DR

This paper analyzes the C++ implementation of BigClam in SNAP, an algorithm for detecting overlapping communities in large networks, and shows that parallelizing a key stage with OpenMP speeds it up by 5.3 times without changing the results.

Contribution

The paper identifies the runtime bottleneck in SNAP's BigClam implementation, the stage that determines whether a node belongs to a community, and parallelizes it across threads with OpenMP for a 5.3 times speedup.

Findings

01

Parallelization with OpenMP speeds up the algorithm by 5.3 times on large networks.

02

Determining whether a node belongs to a community dominates the runtime, yet this stage is not parallelized in the original implementation.

03

The parallel implementation preserves the integrity of the program and its results.

Abstract

We perform a detailed analysis of the C++ implementation of the Cluster Affiliation Model for Big Networks (BigClam) on the Stanford Network Analysis Project (SNAP). BigClam is a popular graph mining algorithm that is capable of finding overlapping communities in networks containing millions of nodes. Our analysis shows a key stage of the algorithm - determining if a node belongs to a community - dominates the runtime of the implementation, yet the computation is not parallelized. We show that by parallelizing computations across multiple threads using OpenMP we can speed up the algorithm by 5.3 times when solving large networks for communities, while preserving the integrity of the program and the result.
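The kind of change the abstract describes can be sketched as follows. This is a minimal illustration, not the SNAP code: it assumes community membership is decided by comparing each entry of a non-negative node-by-community affiliation matrix against a threshold, and the names assignCommunities, F, and delta are hypothetical. The point is only that per-node work which does not share state can be spread across threads with a single OpenMP pragma.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from SNAP: returns, for each node, the
// indices of the communities whose affiliation weight is at least delta.
std::vector<std::vector<int>> assignCommunities(
    const std::vector<std::vector<double>>& F, double delta) {
  assert(delta >= 0.0);  // affiliation weights are assumed non-negative
  // Signed loop bound, since older OpenMP versions require a signed index.
  const long numNodes = static_cast<long>(F.size());
  std::vector<std::vector<int>> membership(F.size());

  // Each iteration reads row u of F and writes only membership[u],
  // so iterations are independent and can run on separate threads.
  // Without -fopenmp the pragma is ignored and the loop runs serially.
  #pragma omp parallel for schedule(dynamic, 64)
  for (long u = 0; u < numNodes; ++u) {
    for (std::size_t c = 0; c < F[u].size(); ++c) {
      if (F[u][c] >= delta) {
        membership[u].push_back(static_cast<int>(c));
      }
    }
  }
  return membership;
}
```

With GCC, compiling with -fopenmp enables the threaded loop. Because each node's result is written by exactly one iteration, the output does not depend on the number of threads, which is the property needed for a speedup that leaves the result unchanged.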

Code & Models

Repositories

liuchbryan/snap/tree/master/contrib/ICL-bigclam_speedup
Official