Exact Recovery of Mangled Clusters with Same-Cluster Queries
Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

TL;DR
This paper presents an algorithm that exactly recovers arbitrary ellipsoidal clusters with margin in the semi-supervised active clustering setting, using a number of same-cluster queries that is logarithmic in the number of input points. It relaxes the earlier spherical cluster assumption and reports experiments supporting the algorithm's efficiency and accuracy.
Contribution
The paper relaxes the spherical cluster assumption to arbitrary ellipsoidal clusters with margin and gives an algorithm that achieves exact recovery with query complexity logarithmic in the number of input points.
Findings
Queries scale logarithmically with input size
Algorithm recovers clusters with zero misclassification
Experimental validation confirms effectiveness
Abstract
We study the cluster recovery problem in the semi-supervised active clustering framework. Given a finite set of input points, and an oracle revealing whether any two points lie in the same cluster, our goal is to recover all clusters exactly using as few queries as possible. To this end, we relax the spherical k-means cluster assumption of Ashtiani et al. to allow for arbitrary ellipsoidal clusters with margin. This removes the assumption that the clustering is center-based (i.e., defined through an optimization problem), and includes all those cases where spherical clusters are individually transformed by any combination of rotations, axis scalings, and point deletions. We show that, even in this much more general setting, it is still possible to recover the latent clustering exactly using a number of queries that scales only logarithmically with the number of input points.
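To make the query model concrete, here is a minimal Python sketch of a same-cluster oracle together with a naive baseline that compares each point against one representative per discovered cluster. This is not the paper's algorithm: it uses no geometry at all and needs up to n·k queries for n points and k clusters, which is linear in n rather than logarithmic. The names `make_oracle` and `naive_recover` are hypothetical, and the oracle is simulated from hidden labels purely for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def make_oracle(labels: Sequence[int]) -> Tuple[Callable[[int, int], bool], Dict[str, int]]:
    """Simulate a same-cluster oracle from hidden labels and count the queries asked."""
    counter = {"queries": 0}

    def same_cluster(i: int, j: int) -> bool:
        counter["queries"] += 1
        return labels[i] == labels[j]

    return same_cluster, counter


def naive_recover(n: int, same_cluster: Callable[[int, int], bool]) -> List[List[int]]:
    """Recover the clustering exactly by querying each point against cluster representatives."""
    clusters: List[List[int]] = []  # the first index in each list acts as the representative
    for i in range(n):
        for cluster in clusters:
            if same_cluster(cluster[0], i):
                cluster.append(i)
                break
        else:
            # No representative matched, so point i starts a new cluster.
            clusters.append([i])
    return clusters


hidden = [0, 1, 0, 2, 1, 0]  # latent clustering, unknown to the learner
oracle, counter = make_oracle(hidden)
clusters = naive_recover(len(hidden), oracle)
print(clusters)            # [[0, 2, 5], [1, 4], [3]]
print(counter["queries"])  # 7
```

The baseline always recovers the latent clustering exactly, but its query count grows with every input point. The abstract's claim is that the ellipsoidal-with-margin structure allows the same exact recovery with a query count that grows only logarithmically in the number of points.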
Taxonomy
Topics: Optimization and Search Problems · Data Management and Algorithms · Algorithms and Data Compression
