Nearly Optimal Clustering Risk Bounds for Kernel K-Means

Yong Liu; Lizhong Ding; Weiping Wang

arXiv:2003.03888·cs.LG·May 15, 2020·1 cites

TL;DR

This paper establishes a nearly optimal excess clustering risk bound for kernel k-means and analyzes how the Nyström approximation affects statistical accuracy.

Contribution

To the authors' knowledge, the paper provides the first sharp excess clustering risk bounds for kernel and approximate kernel k-means, improving on prior theoretical results.

Findings

01

Achieves nearly optimal excess clustering risk bounds.

02

Shows Nyström kernel k-means with $\Omega(\sqrt{nk})$ landmark points matches exact kernel k-means accuracy.

03

Provides theoretical analysis of computational approximation effects.

Abstract

In this paper, we study the statistical properties of kernel $k$-means and obtain a nearly optimal excess clustering risk bound, substantially improving the state-of-the-art bounds in the existing clustering risk analyses. We further analyze the statistical effect of computational approximations of the Nyström kernel $k$-means, and prove that it achieves the same statistical accuracy as the exact kernel $k$-means considering only $\Omega(\sqrt{nk})$ Nyström landmark points. To the best of our knowledge, such sharp excess clustering risk bounds for kernel (or approximate kernel) $k$-means have never been proposed before.
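
To make the Nyström approximation mentioned in the abstract concrete, here is a minimal NumPy sketch of one common way to run approximate kernel $k$-means: sample landmark points, build Nyström features, then run ordinary Lloyd iterations on those features. It is an illustration under stated assumptions, not the authors' procedure. The RBF kernel, uniform landmark sampling, the choice m = ceil(sqrt(n k)) with constant 1, and the helper names `rbf_kernel`, `nystrom_features`, and `lloyd_kmeans` are all hypothetical choices made for this example.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, m, gamma=1.0, seed=0):
    # Assumption: landmarks are sampled uniformly without replacement.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    landmarks = X[idx]
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    K_nm = rbf_kernel(X, landmarks, gamma)
    # K_mm^{-1/2} from an eigendecomposition, dropping tiny eigenvalues.
    w, V = np.linalg.eigh(K_mm)
    keep = w > 1e-8 * w.max()
    return K_nm @ (V[:, keep] / np.sqrt(w[keep]))

def lloyd_kmeans(Z, k, n_iter=50, seed=0):
    # Plain Lloyd iterations on the (approximate) feature vectors.
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = Z[labels == j].mean(0)
    # Final assignment against the last set of centers.
    d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d.argmin(1), centers

# Toy data: two well-separated Gaussian blobs (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(5, 0.3, (100, 2))])
n, k = len(X), 2
m = int(np.ceil(np.sqrt(n * k)))  # landmark count on the order of sqrt(n k)
Z = nystrom_features(X, m, gamma=0.5)
labels, centers = lloyd_kmeans(Z, k)
```

With m landmarks the feature matrix has at most m columns, so clustering works on an n-by-m matrix instead of the full n-by-n kernel matrix. That size reduction is the computational saving whose statistical cost the paper analyzes.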

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Stochastic Gradient Optimization Techniques