Guaranteed clustering and biclustering via semidefinite programming
Brendan P. W. Ames

TL;DR
This paper introduces semidefinite programming relaxations for clustering and biclustering and establishes conditions under which they exactly recover the true clusters, backed by theoretical guarantees and numerical experiments.
Contribution
It develops novel semidefinite programming formulations for clustering and biclustering with proven exact recovery guarantees under certain data conditions.
Findings
The semidefinite relaxation is exact for input graphs corresponding to data with k large, distinct clusters and a smaller number of outliers.
The approach extends to biclustering with similar theoretical guarantees.
Numerical experiments support the theoretical results.
Abstract
Identifying clusters of similar objects in data plays a significant role in a wide range of applications. As a model problem for clustering, we consider the densest k-disjoint-clique problem, whose goal is to identify the collection of k disjoint cliques of a given weighted complete graph maximizing the sum of the densities of the complete subgraphs induced by these cliques. In this paper, we establish conditions ensuring exact recovery of the densest k cliques of a given graph from the optimal solution of a particular semidefinite program. In particular, the semidefinite relaxation is exact for input graphs corresponding to data consisting of k large, distinct clusters and a smaller number of outliers. This approach also yields a semidefinite relaxation for the biclustering problem with similar recovery guarantees. Given a set of objects and a set of features exhibited by these…
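The model problem in the abstract can be made concrete with a small sketch. It is illustrative only and does not reproduce the paper's formulation or solve any semidefinite program. It assumes that the density of a clique C is the sum of W[i][j] over all ordered pairs i, j in C (diagonal included) divided by |C|, and that a set of disjoint cliques is "lifted" to a matrix X with entries 1/|C| inside each clique and 0 elsewhere. The toy weight matrix and all function names are hypothetical.

```python
# Toy illustration of the densest k-disjoint-clique objective.
# Assumption: density(C) = sum_{i,j in C} W[i][j] / |C| (ordered pairs, diagonal included).

def clique_density(W, clique):
    """Total weight inside the clique divided by its number of nodes."""
    total = sum(W[i][j] for i in clique for j in clique)
    return total / len(clique)

def objective(W, cliques):
    """Sum of the densities of the subgraphs induced by the disjoint cliques."""
    return sum(clique_density(W, c) for c in cliques)

def lifted_matrix(n, cliques):
    """Assumed lifting: X[i][j] = 1/|C| if i and j share clique C, else 0."""
    X = [[0.0] * n for _ in range(n)]
    for c in cliques:
        for i in c:
            for j in c:
                X[i][j] = 1.0 / len(c)
    return X

def inner_product(W, X):
    """Trace inner product <W, X> = sum_{i,j} W[i][j] * X[i][j]."""
    n = len(W)
    return sum(W[i][j] * X[i][j] for i in range(n) for j in range(n))

# Hypothetical data: 6 nodes, two planted clusters, node 5 is an outlier.
n = 6
cliques = [[0, 1, 2], [3, 4]]
label = {i: idx for idx, c in enumerate(cliques) for i in c}
# Weight 1.0 inside a planted cluster, 0.1 everywhere else (made-up values).
W = [[1.0 if i in label and label.get(i) == label.get(j) else 0.1
      for j in range(n)] for i in range(n)]

X = lifted_matrix(n, cliques)
value = objective(W, cliques)
trace_X = sum(X[i][i] for i in range(n))
row_sums = [sum(row) for row in X]

print("objective:", value)
print("<W, X>:", inner_product(W, X))
print("trace(X):", trace_X)
print("row sums:", row_sums)
```

Under these assumptions the objective equals the inner product of W with the lifted matrix, the trace of X equals the number of cliques, and every row of X sums to at most 1 (exactly 0 for the outlier). Properties of this kind are what make a linear objective over a convex set of matrices a natural relaxation of the combinatorial problem.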
