Boost K-Means

Wan-Lei Zhao; Cheng-Hao Deng; Chong-Wah Ngo

arXiv:1610.02483 · cs.LG · December 6, 2016 · 1 citation

Open Access

TL;DR

This paper introduces a k-means variant driven by an explicit objective function. It replaces the usual assign-then-update loop with a purely stochastic optimization procedure, converges to better local optima, and is reported to show superior performance in document clustering, image clustering, and nearest neighbor search.

Contribution

A new k-means variant that replaces the classic alternating loop with a stochastic optimization procedure driven by an explicit objective function, yielding a simpler algorithm that converges to better local optima.

Findings

01 Achieves better local optima in clustering tasks

02 Demonstrates superior performance in document clustering, image clustering, and nearest neighbor search

03 Simplifies the k-means algorithm by replacing its alternating loop with a stochastic optimization procedure

Abstract

Due to its simplicity and versatility, k-means has remained popular since it was proposed three decades ago. Its performance has been enhanced from different perspectives over the years. Unfortunately, a good trade-off between quality and efficiency has rarely been reached. In this paper, a novel k-means variant is presented. Unlike most k-means variants, its clustering procedure is driven by an explicit objective function, which is feasible for the whole l2-space. The classic chicken-and-egg loop in k-means is simplified to a purely stochastic optimization procedure. The procedure becomes simpler and converges to considerably better local optima. The effectiveness of this new variant has been studied extensively in different contexts, such as document clustering, nearest neighbor search, and image clustering. Superior performance is observed across different…
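The abstract does not state the objective function or the update rule, so the following is only a minimal sketch of what an objective-driven stochastic procedure of this kind could look like, not the authors' implementation. It assumes the objective I = Σ_r ‖D_r‖² / n_r, where D_r is the sum of the points in cluster r and n_r is its size; maximizing I is equivalent to minimizing the usual k-means sum of squared errors, because SSE = Σ_i ‖x_i‖² − I. Points are visited in random order and each is moved to whichever cluster increases I the most. All function and parameter names (`stochastic_kmeans`, `n_epochs`, `seed`, and the helpers) are hypothetical.

```python
import random

def sq_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(x * x for x in v)

def cluster_stats(points, labels, k):
    """Per-cluster composite vectors (sums of members) and cluster sizes."""
    sums = [[0.0] * len(points[0]) for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for j, x in enumerate(p):
            sums[c][j] += x
    return sums, counts

def objective(points, labels, k):
    """Assumed objective: I = sum_r ||D_r||^2 / n_r (larger is better)."""
    sums, counts = cluster_stats(points, labels, k)
    return sum(sq_norm(s) / n for s, n in zip(sums, counts) if n)

def sse(points, labels, k):
    """Classic k-means distortion, for comparison with the objective."""
    sums, counts = cluster_stats(points, labels, k)
    cents = [[x / n for x in s] if n else s for s, n in zip(sums, counts)]
    return sum(sq_norm([x - c for x, c in zip(p, cents[l])])
               for p, l in zip(points, labels))

def stochastic_kmeans(points, k, n_epochs=20, seed=0):
    """Move one point at a time to the cluster that most increases I."""
    if len(points) < k:
        raise ValueError("need at least k points")
    rng = random.Random(seed)
    # Random initial partition in which every cluster is non-empty.
    labels = [i % k for i in range(len(points))]
    rng.shuffle(labels)
    sums, counts = cluster_stats(points, labels, k)
    order = list(range(len(points)))
    for _ in range(n_epochs):
        rng.shuffle(order)
        moved = False
        for i in order:
            p, a = points[i], labels[i]
            if counts[a] == 1:
                continue  # never empty a cluster
            # Drop in I caused by removing p from its current cluster a.
            without_p = [s - x for s, x in zip(sums[a], p)]
            loss = sq_norm(sums[a]) / counts[a] - sq_norm(without_p) / (counts[a] - 1)
            best, best_gain = a, 1e-12  # small threshold guards against float noise
            for b in range(k):
                if b == a:
                    continue
                with_p = [s + x for s, x in zip(sums[b], p)]
                gain = (sq_norm(with_p) / (counts[b] + 1)
                        - sq_norm(sums[b]) / counts[b] - loss)
                if gain > best_gain:
                    best, best_gain = b, gain
            if best != a:
                for j, x in enumerate(p):
                    sums[a][j] -= x
                    sums[best][j] += x
                counts[a] -= 1
                counts[best] += 1
                labels[i] = best
                moved = True
        if not moved:
            break  # no single move improves I: a local optimum
    return labels
```

In this sketch there is no separate centroid-update step: each accepted move updates the two affected composite vectors in place, so every move strictly increases the objective and the loop stops at a local optimum with respect to single-point moves.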

Taxonomy

Topics: Advanced Statistical Methods and Models