Statistical Mechanics of Semi-Supervised Clustering in Sparse Graphs
Greg Ver Steeg, Aram Galstyan, Armen E. Allahverdyan

TL;DR
This paper provides a theoretical analysis of semi-supervised clustering in sparse graphs, showing how pairwise constraints influence detection thresholds and cluster recoverability.
Contribution
It offers a novel theoretical framework for understanding the effects of pairwise constraints on clustering in sparse graphs, extending previous unsupervised results.
Findings
At low constraint density, constraints only shift the detection threshold while preserving criticality.
Sufficiently dense constraints suppress both the criticality and the detection threshold.
Adding constraints does not automatically improve clustering accuracy.
Abstract
We theoretically study semi-supervised clustering in sparse graphs in the presence of pairwise constraints on the cluster assignments of nodes. We focus on bi-cluster graphs, and study the impact of semi-supervision for varying constraint density and overlap between the clusters. Recent results for unsupervised clustering in sparse graphs indicate that there is a critical ratio of within-cluster to between-cluster connectivities below which clusters cannot be recovered with better than random accuracy. The goal of this paper is to examine the impact of pairwise constraints on the clustering accuracy. Our results suggest that the addition of constraints does not provide automatic improvement over the unsupervised case. When the density of the constraints is sufficiently small, their only impact is to shift the detection threshold while preserving the criticality. Conversely, if the…
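To make the setting in the abstract concrete, the following is a minimal sketch of a sparse bi-cluster graph with pairwise constraints. It assumes a standard planted-partition model with two equal-size clusters, edge probabilities c_in/n within a cluster and c_out/n between clusters, and constraints drawn on uniformly random node pairs. The function names, parameters, and sampling scheme are illustrative assumptions, not the paper's own model or parametrization.

```python
import random
from itertools import combinations


def planted_bicluster(n, c_in, c_out, seed=0):
    """Sample a sparse two-cluster graph (hypothetical helper).

    Nodes 0..n/2-1 get label +1, the rest get label -1. Each pair is
    linked with probability c_in/n (same cluster) or c_out/n (different
    clusters), so the expected degree stays finite as n grows.
    """
    rng = random.Random(seed)
    labels = [1 if i < n // 2 else -1 for i in range(n)]
    p_in, p_out = c_in / n, c_out / n
    edges = []
    for i, j in combinations(range(n), 2):
        p = p_in if labels[i] == labels[j] else p_out
        if rng.random() < p:
            edges.append((i, j))
    return labels, edges


def sample_constraints(labels, num_constraints, seed=0):
    """Draw pairwise constraints on random node pairs (assumed scheme).

    Each constraint is (i, j, s) with s = +1 for "same cluster"
    (must-link) and s = -1 for "different clusters" (cannot-link).
    """
    rng = random.Random(seed)
    n = len(labels)
    constraints = []
    for _ in range(num_constraints):
        i, j = rng.sample(range(n), 2)  # two distinct nodes
        constraints.append((i, j, labels[i] * labels[j]))
    return constraints


if __name__ == "__main__":
    labels, edges = planted_bicluster(n=200, c_in=5.0, c_out=1.0, seed=1)
    constraints = sample_constraints(labels, num_constraints=20, seed=1)
    print(len(edges), "edges,", len(constraints), "constraints")
```

Varying `c_in` against `c_out` changes the overlap between the clusters, and varying `num_constraints` relative to `n` changes the constraint density, which are the two quantities the abstract says the analysis explores.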
Taxonomy
Topics: Complex Network Analysis Techniques · Opinion Dynamics and Social Influence · Bayesian Methods and Mixture Models
