Clustering with Non-adaptive Subset Queries
Hadley Black, Euiwoong Lee, Arya Mazumdar, Barna Saha

TL;DR
This paper introduces the first non-adaptive algorithms for clustering with subset queries. They need far fewer queries than the quadratic number that non-adaptive pair-wise schemes require, with the strongest bounds when the number of clusters is constant or the clusters are balanced.
Contribution
The paper gives non-adaptive algorithms for clustering with subset queries, together with query-complexity bounds under restrictions on query size and under cluster-balance assumptions.
Findings
A non-adaptive algorithm making $O(n \log k \cdot (\log k + \log\log n)^2)$ subset queries.
A lower bound of $\Omega(\max(n^2/s^2, n))$ queries when queries are restricted to size at most $s$.
Non-adaptive algorithms making $O(n \log k)$ queries when the clusters are balanced.
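To get a feel for the size-restricted lower bound, the short sketch below evaluates $\max(n^2/s^2, n)$ for a few query sizes. The function name `query_size_lower_bound` and the sample values are illustrative assumptions, and the hidden constant in the $\Omega(\cdot)$ is ignored, so the numbers show only how the bound scales.

```python
def query_size_lower_bound(n, s):
    """Evaluate max(n^2 / s^2, n), ignoring the constant hidden by Omega.

    n: number of points; s: maximum allowed query size (hypothetical inputs).
    """
    return max(n * n / (s * s), n)

n = 10_000
for s in (2, 10, 100, 1_000):
    # The n^2/s^2 term dominates while s < sqrt(n); afterwards the bound is n.
    print(s, query_size_lower_bound(n, s))
```

Since $n^2/s^2 \le n$ exactly when $s \ge \sqrt{n}$, the bound is quadratic for pair queries ($s = 2$) and flattens to linear once queries may contain about $\sqrt{n}$ points.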
Abstract
Recovering the underlying clustering of a set $U$ of $n$ points by asking pair-wise same-cluster queries has garnered significant interest in the last decade. Given a query $S \subset U$, $|S| = 2$, the oracle returns yes if the points are in the same cluster and no otherwise. For adaptive algorithms with pair-wise queries, the number of required queries is known to be $\Theta(nk)$, where $k$ is the number of clusters. However, non-adaptive schemes require $\Omega(n^2)$ queries, which matches the trivial $O(n^2)$ upper bound attained by querying every pair of points. To break the quadratic barrier for non-adaptive queries, we study a generalization of this problem to subset queries for $|S| > 2$, where the oracle returns the number of clusters intersecting $S$. Allowing for subset queries of unbounded size, $O(n)$ queries is possible with an adaptive scheme (Chakrabarty-Liao, 2024).…
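The query model described above can be sketched in a few lines of Python. This is only an illustration of the oracle and of the trivial all-pairs baseline, not the paper's algorithm; the names `make_oracle` and `all_pairs_baseline` and the toy labels are assumptions made for the example.

```python
from itertools import combinations

def make_oracle(labels):
    """Build a subset-query oracle for a hidden clustering.

    labels[i] is the hidden cluster id of point i (hypothetical toy input).
    """
    def query(subset):
        # Answer: the number of distinct clusters intersecting the subset.
        return len({labels[i] for i in subset})
    return query

def all_pairs_baseline(n, query):
    """Trivial non-adaptive scheme: fix every pair query before seeing answers."""
    pairs = list(combinations(range(n), 2))   # n*(n-1)/2 queries, chosen up front
    answers = [query(p) for p in pairs]       # 1 = same cluster, 2 = different

    # Union-find merges points that were reported to share a cluster.
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for (u, v), answer in zip(pairs, answers):
        if answer == 1:
            parent[find(u)] = find(v)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values()), len(pairs)

hidden = [0, 1, 0, 2, 1, 0]                   # toy hidden clustering, k = 3
oracle = make_oracle(hidden)
clusters, num_queries = all_pairs_baseline(len(hidden), oracle)
print(clusters, num_queries)
```

For pairs the oracle's answer of 1 or 2 plays the role of yes or no, so the same function covers both the pair-wise and the subset setting. The baseline is non-adaptive because the full list of queries is fixed before any answer is read, and it spends a quadratic number of queries, which is the cost that larger subset queries aim to avoid.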
Taxonomy
Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Advanced Clustering Algorithms Research
Methods: Sparse Evolutionary Training
