Same-Cluster Querying for Overlapping Clusters
Wasim Huleihel, Arya Mazumdar, Muriel Médard, and Soumyabrata Pal

TL;DR
This paper addresses the challenge of efficiently recovering overlapping clusters using minimal queries by developing algorithms that are order optimal, noise-tolerant, and validated on real-world data.
Contribution
It introduces new algorithms for overlapping cluster recovery with theoretical guarantees and practical efficiency, extending prior work from disjoint to overlapping clusters.
Findings
Algorithms are order optimal in query complexity.
Algorithms tolerate random noise and cover both arbitrary (worst-case) and statistical models.
Validated on synthetic and real-world datasets.
Abstract
Overlapping clusters are common in models of many practical data-segmentation applications. Suppose we are given n elements to be clustered into k possibly overlapping clusters, and an oracle that can interactively answer queries of the form "do elements u and v belong to the same cluster?" The goal is to recover the clusters with the minimum number of such queries. This problem has been of recent interest for the case of disjoint clusters. In this paper, we look at the more practical scenario of overlapping clusters, and provide upper bounds (with algorithms) on the sufficient number of queries. We provide algorithmic results under both arbitrary (worst-case) and statistical modeling assumptions. Our algorithms are parameter free, efficient, and work in the presence of random noise. We also derive information-theoretic lower bounds on the number of queries needed, proving that our…
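The query model in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it only builds a same-cluster oracle and a naive baseline that queries every pair. It assumes that "same cluster" means the two elements share at least one cluster, and that a noisy answer is flipped with a fixed probability and stays the same when the query is repeated. The names make_oracle, query_all_pairs, membership, and flip_prob are hypothetical, introduced here for illustration.

```python
import itertools
import random


def make_oracle(membership, flip_prob=0.0, seed=None):
    """Build a same-cluster oracle.

    membership maps each element to the set of cluster ids that contain it
    (an element in several clusters has several ids).
    flip_prob is an assumed noise level: each answer is flipped with this
    probability, independently per pair.
    """
    rng = random.Random(seed)
    answers = {}  # cache, so a repeated query returns the same answer

    def same_cluster(u, v):
        key = frozenset((u, v))  # the query is symmetric in u and v
        if key not in answers:
            # Assumed semantics: "yes" if u and v share at least one cluster.
            truth = bool(membership[u] & membership[v])
            flipped = rng.random() < flip_prob
            answers[key] = truth != flipped
        return answers[key]

    return same_cluster


def query_all_pairs(elements, oracle):
    """Naive baseline: ask about every pair, n * (n - 1) / 2 queries."""
    return {(u, v): oracle(u, v) for u, v in itertools.combinations(elements, 2)}


# Toy instance: "b" belongs to two clusters, so the clusters overlap.
membership = {"a": {0}, "b": {0, 1}, "c": {1}, "d": {2}}
oracle = make_oracle(membership)
answers = query_all_pairs(sorted(membership), oracle)
```

In this toy instance "a" and "c" both answer yes with "b" but no with each other, which cannot happen with disjoint clusters. The all-pairs baseline is the cost that a query-efficient algorithm aims to beat.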
Taxonomy
Topics: Data Management and Algorithms · Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models
Methods: Test
