Approximate kernel clustering

Subhash Khot; Assaf Naor

arXiv:0807.4626·cs.DS·December 9, 2008

TL;DR

This paper gives a constant-factor polynomial-time approximation algorithm for the kernel clustering problem, studies the problem's computational complexity, and, in some cases, determines the sharp approximation threshold assuming the Unique Games Conjecture (UGC).

Contribution

It gives a constant-factor polynomial-time approximation algorithm for kernel clustering, answering a question posed by Song, Smola, Gretton and Borgwardt, and determines the UGC hardness threshold in some cases, including when $B$ is the $3 \times 3$ identity matrix.

Findings

01

Designed a constant-factor polynomial-time approximation algorithm for kernel clustering.

02

Computed the sharp approximation threshold for kernel clustering in some cases, assuming the UGC.

03

Connected the problem to a geometric conjecture influencing thresholds for identity matrices.

Abstract

In the kernel clustering problem we are given a large $n \times n$ positive semi-definite matrix $A = (a_{ij})$ with $\sum_{i, j = 1}^{n} a_{ij} = 0$ and a small $k \times k$ positive semi-definite matrix $B = (b_{ij})$. The goal is to find a partition $S_{1}, \ldots, S_{k}$ of $\{1, \ldots, n\}$ which maximizes the quantity $\sum_{i, j = 1}^{k} \left( \sum_{(p, q) \in S_{i} \times S_{j}} a_{pq} \right) b_{ij}$. We study the computational complexity of this generic clustering problem, which originates in the theory of machine learning. We design a constant factor polynomial time approximation algorithm for this problem, answering a question posed by Song, Smola, Gretton and Borgwardt. In some cases we manage to compute the sharp approximation threshold for this problem assuming the Unique Games Conjecture (UGC). In particular, when $B$ is the $3 \times 3$ identity matrix the UGC hardness threshold of this problem is exactly…
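
To make the objective in the abstract concrete, here is a minimal Python sketch that evaluates it for a given partition and finds the optimum of a tiny instance by exhaustive search. It is not the paper's approximation algorithm. The function names `clustering_value` and `brute_force_best`, the label encoding of a partition, and the toy matrices are assumptions introduced for illustration only. The search takes $k^n$ steps, so it is usable only for very small $n$.

```python
from itertools import product


def clustering_value(A, B, labels):
    """Kernel clustering objective for one partition.

    labels[p] is the index i of the part S_i that contains element p,
    so summing a_pq * b_{labels[p], labels[q]} over all (p, q) equals
    sum_{i,j} (sum_{(p,q) in S_i x S_j} a_pq) * b_ij.
    """
    n = len(A)
    return sum(
        A[p][q] * B[labels[p]][labels[q]]
        for p in range(n)
        for q in range(n)
    )


def brute_force_best(A, B):
    """Exhaustively try all k^n labelings (illustration only).

    Assumption: parts of the partition are allowed to be empty.
    """
    n, k = len(A), len(B)
    best = max(
        product(range(k), repeat=n),
        key=lambda labels: clustering_value(A, B, labels),
    )
    return clustering_value(A, B, best), best


# Toy instance (hypothetical data): A = x x^T with x summing to zero,
# so A is positive semi-definite and its entries sum to zero.
x = [1, 1, -1, -1]
A = [[xp * xq for xq in x] for xp in x]
B = [[1, 0], [0, 1]]  # 2 x 2 identity

best_value, best_labels = brute_force_best(A, B)
print(best_value, best_labels)
```

When $B$ is the identity, the objective reduces to $\sum_{i} \left( \sum_{p \in S_{i}} x_{p} \right)^{2}$ for this rank-one $A$, so the search groups the two positive entries of `x` in one part and the two negative entries in the other.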

Taxonomy

Topics: Complexity and Algorithms in Graphs · Optimization and Search Problems · Stochastic Gradient Optimization Techniques