On Margin-Based Cluster Recovery with Oracle Queries
Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

TL;DR
This paper introduces a margin-based framework for active cluster recovery using oracle queries, achieving near-optimal query complexity across various settings including Euclidean and pseudometric spaces.
Contribution
It proposes a general notion of margin that unifies previous concepts and designs algorithms that recover all clusters with logarithmic query complexity in diverse scenarios.
Findings
Under the margin assumptions, the algorithms recover all clusters exactly with O(log n) queries.
For convex clusters in Euclidean space, the number of queries is lower than that of the best existing algorithm by a factor of Θ(m^m).
Margin conditions are shown to be equivalent to recoverability in many concept classes.
Abstract
We study an active cluster recovery problem where, given a set of points and an oracle answering queries like "are these two points in the same cluster?", the task is to recover exactly all clusters using as few queries as possible. We begin by introducing a simple but general notion of margin between clusters that captures, as special cases, the margins used in previous work, the classic SVM margin, and standard notions of stability for center-based clusterings. Then, under our margin assumptions we design algorithms that, in a variety of settings, recover all clusters exactly using only O(log n) queries. For the Euclidean case, R^m, we give an algorithm that recovers arbitrary convex clusters, in polynomial time, and with a number of queries that is lower than the best existing algorithm by Θ(m^m) factors. For general pseudometric spaces, where clusters might…
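To make the query model concrete, here is a minimal Python sketch of a same-cluster oracle together with a naive recovery baseline. This is not the paper's algorithm: the baseline compares each point against one representative per discovered cluster, so it can use up to n·k queries for n points and k clusters, which is the kind of cost the O(log n) results aim to avoid. The names `make_same_cluster_oracle` and `naive_recover` and the toy labels are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def make_same_cluster_oracle(
    labels: Sequence[int],
) -> Tuple[Callable[[int, int], bool], Dict[str, int]]:
    """Build a hypothetical oracle over hidden labels and a query counter."""
    counter = {"queries": 0}

    def oracle(i: int, j: int) -> bool:
        # Answers "are points i and j in the same cluster?"
        counter["queries"] += 1
        return labels[i] == labels[j]

    return oracle, counter


def naive_recover(n: int, oracle: Callable[[int, int], bool]) -> List[List[int]]:
    """Baseline: compare each point with one representative per known cluster."""
    clusters: List[List[int]] = []
    for x in range(n):
        for cluster in clusters:
            if oracle(x, cluster[0]):  # cluster[0] acts as the representative
                cluster.append(x)
                break
        else:
            # No representative matched, so x starts a new cluster.
            clusters.append([x])
    return clusters


# Toy instance (assumed data): six points in three hidden clusters.
labels = [0, 1, 0, 2, 1, 0]
oracle, counter = make_same_cluster_oracle(labels)
clusters = naive_recover(len(labels), oracle)
print(clusters, counter["queries"])
```

The baseline needs no geometric information at all, which is why its query count grows with n; the margin assumptions described in the abstract are what allow exact recovery with far fewer queries.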
Taxonomy
Topics: Complexity and Algorithms in Graphs · Optimization and Search Problems · Machine Learning and Algorithms
Methods: Support Vector Machine
