Clustering with Non-adaptive Subset Queries

Hadley Black; Euiwoong Lee; Arya Mazumdar; Barna Saha

arXiv:2409.10908·cs.DS·April 16, 2025

TL;DR

This paper introduces the first non-adaptive algorithms for clustering with subset queries. They break the $\Omega(n^2)$ barrier that non-adaptive pair-wise queries face, with the strongest bounds when the number of clusters is constant or the clusters are balanced.

Contribution

It provides novel non-adaptive algorithms for subset query-based clustering, with improved query complexity bounds and analysis for various query size restrictions and cluster balance conditions.

Findings

01

Non-adaptive algorithms with $O(n \, \log k \cdot (\log k + \log\log n)^2)$ queries.

02

Lower bounds of $\Omega(\max(n^2/s^2, n))$ queries when query size is restricted to at most $s$.

03

Algorithms with $O(n \log k)$ queries for balanced clusters.
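
As a sanity check on finding 02, the lower bound can be evaluated at two extremes. This sketch assumes $s$ denotes the maximum allowed query size, which the summary does not define explicitly.

$$
s = 2:\quad \max\!\left(\frac{n^2}{s^2},\, n\right) = \max\!\left(\frac{n^2}{4},\, n\right) = \Theta(n^2),
\qquad
s \ge \sqrt{n}:\quad \frac{n^2}{s^2} \le n \;\Rightarrow\; \max\!\left(\frac{n^2}{s^2},\, n\right) = n.
$$

The first case is consistent with the $\Omega(n^2)$ bound for non-adaptive pair-wise queries quoted in the abstract. The second shows that the quadratic term stops dominating once queries may contain about $\sqrt{n}$ points.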

Abstract

Recovering the underlying clustering of a set $U$ of $n$ points by asking pair-wise same-cluster queries has garnered significant interest in the last decade. Given a query $S \subset U$, $|S| = 2$, the oracle returns yes if the points are in the same cluster and no otherwise. For adaptive algorithms with pair-wise queries, the number of required queries is known to be $\Theta(nk)$, where $k$ is the number of clusters. However, non-adaptive schemes require $\Omega(n^2)$ queries, which matches the trivial $O(n^2)$ upper bound attained by querying every pair of points. To break the quadratic barrier for non-adaptive queries, we study a generalization of this problem to subset queries for $|S| > 2$, where the oracle returns the number of clusters intersecting $S$. Allowing for subset queries of unbounded size, $O(n)$ queries is possible with an adaptive scheme (Chakrabarty-Liao, 2024).…
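
The query model in the abstract can be illustrated with a minimal Python sketch. It shows a subset-query oracle that returns the number of clusters intersecting $S$, and the trivial non-adaptive baseline that fixes all $\binom{n}{2}$ pair queries up front. This is not the paper's algorithm; the names `make_oracle`, `all_pairs_queries`, and `recover_clustering`, and the toy labelling, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def make_oracle(labels):
    """Build a subset-query oracle for a hidden clustering.

    `labels` maps each point to its (hidden) cluster id. The oracle
    returns how many distinct clusters the queried subset intersects.
    """
    def oracle(subset):
        return len({labels[p] for p in subset})
    return oracle


def all_pairs_queries(points):
    """Trivial non-adaptive scheme: every pair, chosen before any answer is seen."""
    return list(combinations(points, 2))


def recover_clustering(points, queries, answers):
    """Merge points whose pair query touched exactly one cluster (union-find)."""
    parent = {p: p for p in points}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for (a, b), answer in zip(queries, answers):
        if answer == 1:  # both points lie in the same cluster
            parent[find(a)] = find(b)

    clusters = {}
    for p in points:
        clusters.setdefault(find(p), []).append(p)
    return sorted(sorted(c) for c in clusters.values())


# Toy instance (assumed data): n = 5 points in k = 3 hidden clusters.
hidden = {"a": 0, "b": 0, "c": 1, "d": 2, "e": 1}
points = sorted(hidden)
oracle = make_oracle(hidden)

queries = all_pairs_queries(points)       # fixed in advance: non-adaptive
answers = [oracle(q) for q in queries]    # all answers collected afterwards
recovered = recover_clustering(points, queries, answers)
```

On this toy instance the baseline issues $5 \cdot 4 / 2 = 10$ queries. For a pair, an answer of 1 corresponds to "yes" and 2 to "no" in the pair-wise model; larger subsets can return any value up to $k$, which is the extra information the non-adaptive subset-query algorithms exploit.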

Videos

Clustering with Non-adaptive Subset Queries · SlidesLive

Taxonomy

Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Advanced Clustering Algorithms Research

Methods: Sparse Evolutionary Training