Active Learning for Community Detection in Stochastic Block Models

Akshay Gadde; Eyal En Gad; Salman Avestimehr; Antonio Ortega

arXiv:1605.02372·cs.LG·November 17, 2016

TL;DR

This paper explores how active learning can enable community detection in stochastic block models below the traditional threshold by querying a small, sub-linear number of node labels, and provides an efficient algorithm for this task.

Contribution

The paper introduces a method for community detection below the clustering threshold (i.e. $D(a, b) < 1$) using a limited number of label queries, and gives an efficient algorithm with proven conditions for success.

Findings

01

Querying the labels of a vanishingly small fraction of nodes (a number sub-linear in $n$) suffices for detection below the threshold.

02

Recovery is impossible with fewer than $n^{1 - D(a, b)}$ labels.

03

Numerical experiments validate theoretical results.

Abstract

The stochastic block model (SBM) is an important generative model for random graphs in network science and machine learning, useful for benchmarking community detection (or clustering) algorithms. The symmetric SBM generates a graph with $2n$ nodes which cluster into two equally sized communities. Nodes connect with probability $p$ within a community and $q$ across different communities. We consider the case of $p = a \ln(n) / n$ and $q = b \ln(n) / n$. In this case, it was recently shown that recovering the community membership (or label) of every node with high probability (w.h.p.) using only the graph is possible if and only if the Chernoff-Hellinger (CH) divergence $D(a, b) = (\sqrt{a} - \sqrt{b})^{2} \geq 1$. In this work, we study if, and by how much, community detection below the clustering threshold (i.e. $D(a, b) < 1$) is possible by querying the labels of a limited number of chosen nodes (i.e.,…
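The setup in the abstract can be made concrete with a small sketch. The code below is not the authors' algorithm; it only samples a symmetric SBM with the stated edge probabilities, evaluates the CH divergence (taken here to be $(\sqrt{a} - \sqrt{b})^{2}$ for this symmetric two-community model), and computes the $n^{1 - D(a, b)}$ label count listed under Findings. The function names `ch_divergence`, `sample_symmetric_sbm`, and `label_lower_bound` are hypothetical, chosen for illustration.

```python
import math
import random


def ch_divergence(a: float, b: float) -> float:
    """CH divergence for the symmetric two-community SBM (assumed form)."""
    return (math.sqrt(a) - math.sqrt(b)) ** 2


def sample_symmetric_sbm(n: int, a: float, b: float, seed: int = 0):
    """Sample a graph on 2n nodes with two communities of size n.

    Returns (edges, labels): edges is a set of pairs (u, v) with u < v,
    labels[u] is 0 or 1. Illustrative sampler, O(n^2) time.
    """
    rng = random.Random(seed)
    labels = [0] * n + [1] * n      # first n nodes form community 0
    p = a * math.log(n) / n         # within-community edge probability
    q = b * math.log(n) / n         # across-community edge probability
    edges = set()
    for u in range(2 * n):
        for v in range(u + 1, 2 * n):
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                edges.add((u, v))
    return edges, labels


def label_lower_bound(n: int, a: float, b: float) -> float:
    """Evaluate n^(1 - D(a, b)), the label count below which recovery fails."""
    return n ** (1.0 - ch_divergence(a, b))


if __name__ == "__main__":
    n, a, b = 200, 3.0, 1.0
    edges, labels = sample_symmetric_sbm(n, a, b)
    print("D(a, b) =", ch_divergence(a, b))   # below 1 for a=3, b=1
    print("edges sampled:", len(edges))
    print("n^(1 - D) =", label_lower_bound(n, a, b))
```

With `a = 4` and `b = 1` the divergence is exactly 1, which sits on the clustering threshold; choosing `a = 3` and `b = 1` gives a value below 1, the regime in which the paper asks how many label queries are needed.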
