Partial recovery bounds for clustering with the relaxed $K$means
Christophe Giraud, Nicolas Verzelen

TL;DR
This paper establishes exponential decay bounds for misclassification errors of relaxed K-means in sub-Gaussian Mixture Models and Stochastic Block Models, demonstrating its effectiveness and versatility in various clustering scenarios.
Contribution
It provides new partial recovery bounds for relaxed K-means, extending its applicability and improving upon existing results in sGMM and SBM settings.
Findings
Misclassification error decays exponentially with SNR.
Relaxed K-means handles general connection probabilities in SBM.
Bounds improve upon previous results in clustering theory.
Abstract
We investigate the clustering performance of the relaxed $K$means in the setting of the sub-Gaussian Mixture Model (sGMM) and the Stochastic Block Model (SBM). After identifying the appropriate signal-to-noise ratio (SNR), we prove that the misclassification error decays exponentially fast with respect to this SNR. These partial recovery bounds for the relaxed $K$means improve upon results currently known in the sGMM setting. In the SBM setting, applying the relaxed $K$means SDP makes it possible to handle general connection probabilities, whereas other SDPs investigated in the literature are restricted to the assortative case (where within-group probabilities are larger than between-group probabilities). Again, this partial recovery bound complements the state-of-the-art results. Altogether, these results put forward the versatility of the relaxed $K$means.
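For orientation, the semidefinite relaxation of $K$means is usually written in the form due to Peng and Wei. The display below is a sketch of that standard formulation, not a transcription from the paper: the symbols $X$ (the $n \times p$ data matrix), $B$ (the $n \times n$ optimization variable) and $\mathcal{C}_K$ (the feasible set) are notation assumed here, and the exact program analyzed by the authors may differ, for instance by a correction term applied to the Gram matrix $XX^\top$.

$$
\widehat{B} \in \operatorname*{arg\,max}_{B \in \mathcal{C}_K} \langle XX^\top, B \rangle,
\qquad
\mathcal{C}_K = \left\{ B \in \mathbb{R}^{n \times n} : B = B^\top,\ B \succeq 0,\ B_{ab} \ge 0 \ \text{for all } a, b,\ B\mathbf{1} = \mathbf{1},\ \operatorname{Tr}(B) = K \right\}
$$

Exact $K$means corresponds to restricting $B$ to matrices of the form $B_{ab} = 1/|G_k|$ when $a$ and $b$ belong to the same group $G_k$ and $B_{ab} = 0$ otherwise. Every such matrix lies in $\mathcal{C}_K$, so the program above is a convex relaxation of the original combinatorial problem.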
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Advanced Clustering Algorithms Research
