Active Learning for Community Detection in Stochastic Block Models
Akshay Gadde, Eyal En Gad, Salman Avestimehr, Antonio Ortega

TL;DR
This paper explores how active learning can enable community detection in stochastic block models below the traditional threshold by querying a small, sub-linear number of node labels, and provides an efficient algorithm for this task.
Contribution
It introduces a method for community detection below the known threshold using limited label queries and offers an efficient algorithm with proven success conditions.
Findings
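Under certain conditions, querying the labels of a vanishingly small fraction of nodes (a number sub-linear in n) suffices for exact community detection below the threshold.
Recovery is impossible when fewer than n^{1-D(a,b)} labels are observed, where D(a,b) is the Chernoff-Hellinger divergence.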
Numerical experiments validate theoretical results.
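To get a feel for the scale of the lower bound above, the following sketch evaluates n^{1-D(a,b)} for a few graph sizes. It assumes the symmetric-SBM form of the Chernoff-Hellinger divergence, D(a,b) = (sqrt(a) - sqrt(b))^2, and a graph of 2n nodes. The function names and the example values a = 3, b = 1 are illustrative choices, not taken from the paper.

```python
import math


def ch_divergence(a: float, b: float) -> float:
    """Assumed symmetric-SBM form: D(a, b) = (sqrt(a) - sqrt(b))^2."""
    return (math.sqrt(a) - math.sqrt(b)) ** 2


def label_lower_bound(n: int, a: float, b: float) -> float:
    """Evaluate n^(1 - D(a, b)), the label count below which recovery fails."""
    return n ** (1.0 - ch_divergence(a, b))


if __name__ == "__main__":
    a, b = 3.0, 1.0  # illustrative values with D(a, b) < 1
    for n in (10**3, 10**4, 10**5, 10**6):
        bound = label_lower_bound(n, a, b)
        fraction = bound / (2 * n)  # share of the 2n nodes (assumed graph size)
        print(f"n={n:>8}  bound={bound:12.1f}  fraction={fraction:.2e}")
```

When D(a,b) < 1 the bound grows with n, but as a fraction of all nodes it shrinks like n^{-D(a,b)}, which is what "vanishingly small" means here.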
Abstract
The stochastic block model (SBM) is an important generative model for random graphs in network science and machine learning, useful for benchmarking community detection (or clustering) algorithms. The symmetric SBM generates a graph with $2n$ nodes which cluster into two equally sized communities. Nodes connect with probability $p$ within a community and $q$ across different communities. We consider the case of $p = a\ln(n)/n$ and $q = b\ln(n)/n$. In this case, it was recently shown that recovering the community membership (or label) of every node with high probability (w.h.p.) using only the graph is possible if and only if the Chernoff-Hellinger (CH) divergence $D(a,b) = (\sqrt{a} - \sqrt{b})^2 \geq 1$. In this work, we study if, and by how much, community detection below the clustering threshold (i.e. $D(a,b) < 1$) is possible by querying the labels of a limited number of chosen nodes (i.e.,…
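The generative model described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes 2n nodes split into two communities of size n, with edge probabilities p = a ln(n)/n within a community and q = b ln(n)/n across communities. The function name `sample_symmetric_sbm`, the label encoding (+1 / -1), and the `seed` parameter are hypothetical choices made for this example.

```python
import math
import random


def sample_symmetric_sbm(n, a, b, seed=None):
    """Sample a symmetric SBM on 2n nodes (assumed parameterization).

    Nodes 0..n-1 get label +1 and nodes n..2n-1 get label -1.
    Returns (labels, edges), where edges is a list of (u, v) pairs with u < v.
    """
    rng = random.Random(seed)
    p = a * math.log(n) / n  # within-community edge probability
    q = b * math.log(n) / n  # cross-community edge probability
    if not (0.0 <= p <= 1.0 and 0.0 <= q <= 1.0):
        raise ValueError("a, b, n must give probabilities in [0, 1]")

    labels = [1] * n + [-1] * n
    edges = []
    # Draw each unordered pair independently.
    for u in range(2 * n):
        for v in range(u + 1, 2 * n):
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                edges.append((u, v))
    return labels, edges


if __name__ == "__main__":
    labels, edges = sample_symmetric_sbm(n=200, a=3.0, b=1.0, seed=0)
    within = sum(1 for u, v in edges if labels[u] == labels[v])
    across = len(edges) - within
    print(f"nodes={len(labels)}  within-edges={within}  cross-edges={across}")
```

The double loop is quadratic in the number of nodes, which is fine for small illustrative graphs; a larger experiment would likely use a vectorized or sparse sampler instead.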
