Two provably consistent divide and conquer clustering algorithms for large networks
Soumendu Sundar Mukherjee, Purnamrita Sarkar, and Peter J. Bickel

TL;DR
This paper introduces two divide-and-conquer clustering algorithms for large networks that significantly reduce computational costs while maintaining or improving accuracy, and are inherently parallelizable.
Contribution
The paper presents two divide-and-conquer algorithms for community detection that are provably consistent and that scale traditional methods (spectral clustering, semidefinite programs, modularity-based and likelihood-based methods) to large networks without losing accuracy.
Findings
The algorithms significantly reduce the computational cost of traditional clustering methods and are parallelizable by nature.
They maintain clustering accuracy and at times improve it.
Validated through extensive simulations and real-data analysis.
Abstract
In this article, we advance divide-and-conquer strategies for solving the community detection problem in networks. We propose two algorithms which perform clustering on a number of small subgraphs and finally patch the results into a single clustering. The main advantage of these algorithms is that they significantly bring down the computational cost of traditional algorithms, including spectral clustering, semi-definite programs, modularity-based methods, likelihood-based methods, etc., without losing accuracy and at times even improving it. These algorithms are also, by nature, parallelizable. Thus, exploiting the facts that most traditional algorithms are accurate and the corresponding optimization problems are much simpler in small problems, our divide-and-conquer methods provide an omnibus recipe for scaling traditional algorithms up to large networks. We prove…
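The recipe in the abstract (cluster many small subgraphs, then patch the results into one clustering) can be illustrated with a minimal sketch. This is not the authors' algorithms. It assumes one plausible patching rule, averaging pairwise co-membership over random subgraphs and then clustering the averaged matrix. It also assumes a simple spectral base method and assortative communities. All names (`spectral_labels`, `divide_and_conquer_labels`, `sample_sbm`) and parameters are hypothetical.

```python
import numpy as np


def spectral_labels(mat, k):
    """Cluster rows of a symmetric matrix: top-k eigenvectors, then k-means."""
    _, vecs = np.linalg.eigh(mat)      # eigenvalues come back in ascending order
    emb = vecs[:, -k:]                 # assumption: assortative communities
    # Deterministic farthest-first initialisation of the k centers.
    centers = [emb[0]]
    for _ in range(1, k):
        d = np.min([((emb - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(emb[d.argmax()])
    centers = np.array(centers)
    for _ in range(25):                # Lloyd iterations
        dist = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = emb[labels == c].mean(axis=0)
    return labels


def divide_and_conquer_labels(adj, k, n_sub, sub_size, rng):
    """Cluster random subgraphs, then patch via averaged co-membership."""
    n = adj.shape[0]
    together = np.zeros((n, n))        # times i and j received the same label
    sampled = np.zeros((n, n))         # times i and j shared a subgraph
    for _ in range(n_sub):             # iterations are independent, so parallelizable
        nodes = rng.choice(n, size=sub_size, replace=False)
        block = np.ix_(nodes, nodes)
        labels = spectral_labels(adj[block], k)
        together[block] += (labels[:, None] == labels[None, :]).astype(float)
        sampled[block] += 1.0
    # Average co-membership; pairs never sampled together stay at 0.
    avg = np.divide(together, sampled, out=np.zeros((n, n)), where=sampled > 0)
    return spectral_labels(avg, k)


def sample_sbm(truth, p_in, p_out, rng):
    """Sample an undirected stochastic block model graph (for illustration)."""
    n = len(truth)
    probs = np.where(truth[:, None] == truth[None, :], p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = np.repeat([0, 1], 60)
    adj = sample_sbm(truth, p_in=0.8, p_out=0.05, rng=rng)
    labels = divide_and_conquer_labels(adj, k=2, n_sub=60, sub_size=40, rng=rng)
    # Accuracy up to swapping the two labels.
    print(max(np.mean(labels == truth), np.mean(labels != truth)))
```

Each subgraph here has 40 nodes, so every eigendecomposition except the final one runs on a 40 x 40 matrix. Co-membership is used for patching because it does not depend on how each subgraph happens to number its clusters.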
