Approximate Clustering with Same-Cluster Queries

Nir Ailon; Anup Bhattacharya; Ragesh Jaiswal; Amit Kumar

arXiv:1704.01862 · cs.DS · October 5, 2017 · 6 cites



TL;DR

This paper introduces a polynomial-time approximation algorithm for the $k$-means clustering problem using a limited number of same-cluster queries, removing the need for margin assumptions and providing bounds on query complexity.

Contribution

It extends semi-supervised clustering with same-cluster queries to achieve a $(1 + \varepsilon)$-approximation without margin assumptions, using a query complexity independent of dataset size.

Findings

01 Achieves a $(1 + \varepsilon)$-approximation for $k$-means with few queries

02 Provides a lower bound on query complexity under ETH

03 Modifies $k$-means++ to obtain a constant-factor approximation
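
For orientation on the third finding, here is a minimal sketch of standard $k$-means++ seeding ($D^2$ sampling), the procedure the paper starts from. It is not the query-augmented variant from the paper, since this page does not describe how the queries are incorporated. The names `sq_dist` and `kmeanspp_seeds` are hypothetical, and the sketch assumes the input contains at least $k$ distinct points.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeds(points: Sequence[Point], k: int, rng: random.Random) -> List[Point]:
    """Standard k-means++ seeding (no same-cluster queries involved).

    Assumes `points` holds at least k distinct points; otherwise all
    sampling weights can become zero and rng.choices raises ValueError.
    """
    # First center: chosen uniformly at random.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight each point by its squared distance to the nearest chosen center.
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        # Sample the next center with probability proportional to that weight.
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
seeds = kmeanspp_seeds(points, k=3, rng=random.Random(0))
```

Points that are already centers have weight zero, so with distinct inputs the sampled seeds are distinct, and points far from every current center are favored.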

Abstract

Ashtiani et al. proposed a Semi-Supervised Active Clustering framework (SSAC), where the learner is allowed to make adaptive queries to a domain expert. The queries are of the kind "do two given points belong to the same optimal cluster?" There are many clustering contexts where such same-cluster queries are feasible. Ashtiani et al. exhibited the power of such queries by showing that any instance of the $k$-means clustering problem, with an additional margin assumption, can be solved efficiently if one is allowed $O(k^{2} \log k + k \log n)$ same-cluster queries. This is interesting since the $k$-means problem, even with the margin assumption, is $\mathsf{NP}$-hard. In this paper, we extend the work of Ashtiani et al. to the approximation setting, showing that a few such same-cluster queries enable one to get a polynomial-time $(1 + \varepsilon)$-approximation algorithm for the…
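
To make the query model concrete, the following is a minimal sketch of a same-cluster oracle and a naive way to use it. It is not the algorithm from the paper or from Ashtiani et al. The domain expert is simulated from hidden ground-truth labels, which is an assumption made purely for illustration, and the names `make_oracle` and `cluster_with_queries` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def make_oracle(labels: Sequence[int]) -> Callable[[int, int], bool]:
    """Simulated expert: answers from hidden optimal-cluster labels (assumption)."""
    def same_cluster(i: int, j: int) -> bool:
        return labels[i] == labels[j]
    return same_cluster


def cluster_with_queries(
    n: int, same_cluster: Callable[[int, int], bool]
) -> Tuple[List[int], int]:
    """Naive baseline: compare each point to one representative per known cluster.

    Uses at most k queries per point, so up to n * k queries in total.
    """
    representatives: List[int] = []  # one point index per discovered cluster
    assignment: List[int] = []       # cluster id assigned to each point
    queries = 0
    for i in range(n):
        for cluster_id, rep in enumerate(representatives):
            queries += 1
            if same_cluster(i, rep):
                assignment.append(cluster_id)
                break
        else:
            # No representative matched: point i starts a new cluster.
            representatives.append(i)
            assignment.append(len(representatives) - 1)
    return assignment, queries


hidden = [0, 1, 0, 2, 1, 2, 0]
assignment, queries = cluster_with_queries(len(hidden), make_oracle(hidden))
print(assignment, queries)  # [0, 1, 0, 2, 1, 2, 0] 10
```

This baseline recovers the clustering exactly, but its query count grows with the number of points $n$. The results summarized on this page concern the opposite regime: far fewer queries, at the price of an approximate rather than exact solution.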


Taxonomy

Topics: Data Management and Algorithms · Advanced Graph Theory Research · Advanced Clustering Algorithms Research