Probabilistic community detection with unknown number of communities
Junxian Geng, Anirban Bhattacharya, Debdeep Pati

TL;DR
This paper introduces a Bayesian nonparametric framework for community detection in networks that simultaneously estimates the number of communities and their structure, improving accuracy over existing methods.
Contribution
It proposes a probabilistic approach with an efficient MCMC algorithm that avoids reversible jump moves, and provides theoretical risk bounds for community estimation when the number of communities is unknown.
Findings
Outperforms existing algorithms on synthetic and real datasets
Provides non-asymptotic Bayes risk bounds for community estimation
Develops concentration results for functions of Bernoulli variables
Abstract
A fundamental problem in network analysis is clustering the nodes into groups which share a similar connectivity pattern. Existing algorithms for community detection assume the knowledge of the number of clusters or estimate it a priori using various selection criteria and subsequently estimate the community structure. Ignoring the uncertainty in the first stage may lead to erroneous clustering, particularly when the community structure is vague. We instead propose a coherent probabilistic framework for simultaneous estimation of the number of communities and the community structure, adapting recently developed Bayesian nonparametric techniques to network models. An efficient Markov chain Monte Carlo (MCMC) algorithm is proposed which obviates the need to perform reversible jump MCMC on the number of clusters. The methodology is shown to outperform recently developed community detection…
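To make the idea of sampling the partition and the number of communities in a single chain concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: it swaps in a Chinese restaurant process prior (concentration `alpha`) as a stand-in for the Bayesian nonparametric prior mentioned in the abstract, and assumes independent Beta(`a`, `b`) priors on block connection probabilities, which are integrated out. The names `simulate_sbm`, `gibbs_sweep`, and `log_beta` are hypothetical helpers introduced for illustration.

```python
import math
import random


def log_beta(x, y):
    """Log of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def simulate_sbm(sizes, p_in, p_out, rng):
    """Planted-partition network: returns (adjacency matrix, true labels)."""
    z = [k for k, size in enumerate(sizes) for _ in range(size)]
    n = len(z)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if z[i] == z[j] else p_out
            A[i][j] = A[j][i] = int(rng.random() < p)
    return A, z


def gibbs_sweep(A, z, alpha, a, b, rng):
    """One pass of single-node collapsed Gibbs updates; modifies z in place.

    Assumption (not from the paper): CRP prior weights, i.e. an occupied
    block is chosen with weight equal to its size, a new block with alpha.
    """
    n = len(z)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        labels = sorted({z[j] for j in others})
        members = {c: [j for j in others if z[j] == c] for c in labels}
        size = {c: len(members[c]) for c in labels}
        # Number of edges from node i into each block.
        e = {c: sum(A[i][j] for j in members[c]) for c in labels}

        logp = []
        for c in labels:
            lp = math.log(size[c])
            for s in labels:
                if s == c:
                    m = sum(A[u][v] for u in members[c]
                            for v in members[c] if u < v)
                    pairs = size[c] * (size[c] - 1) // 2
                else:
                    m = sum(A[u][v] for u in members[c] for v in members[s])
                    pairs = size[c] * size[s]
                # Beta-Bernoulli predictive for i's edges to block s,
                # given the edges already observed between blocks c and s.
                lp += (log_beta(a + m + e[s], b + pairs - m + size[s] - e[s])
                       - log_beta(a + m, b + pairs - m))
            logp.append(lp)

        # Opening a new block: no edges observed yet for any block pair.
        lp_new = math.log(alpha) + sum(
            log_beta(a + e[s], b + size[s] - e[s]) - log_beta(a, b)
            for s in labels)
        logp.append(lp_new)

        top = max(logp)
        weights = [math.exp(lp - top) for lp in logp]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        z[i] = labels[k] if k < len(labels) else max(z) + 1
    return z


rng = random.Random(0)
A, z_true = simulate_sbm([15, 15], p_in=0.8, p_out=0.05, rng=rng)
z = list(range(len(z_true)))  # start with every node in its own block
for _ in range(20):
    gibbs_sweep(A, z, alpha=1.0, a=1.0, b=1.0, rng=rng)
num_blocks = len(set(z))
print("blocks in final sample:", num_blocks)
```

Because the number of occupied blocks changes whenever a node opens or empties a block, the chain moves across different community counts without a separate reversible jump step. A single final sample is only one draw; in practice one would summarise many post-burn-in draws.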
Taxonomy
Topics: Complex Network Analysis Techniques · Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference
