Boost K-Means
Wan-Lei Zhao, Cheng-Hao Deng, Chong-Wah Ngo

TL;DR
This paper introduces a k-means variant whose clustering procedure is driven by an explicit objective function. It replaces the usual alternating loop of k-means with a stochastic optimization procedure, reaches better local optima, and is evaluated on document clustering, image clustering, and nearest neighbor search.
Contribution
A new k-means variant that simplifies the clustering process into a stochastic optimization driven by an explicit objective function, so that the procedure converges to better local optima.
Findings
Achieves better local optima in clustering tasks
Demonstrates superior performance in document clustering, image clustering, and nearest neighbor search
Simplifies the k-means algorithm with a stochastic approach
Abstract
Due to its simplicity and versatility, k-means has remained popular since it was proposed three decades ago. Its performance has been enhanced from different perspectives over the years. Unfortunately, a good trade-off between quality and efficiency has rarely been reached. In this paper, a novel k-means variant is presented. Unlike most k-means variants, the clustering procedure is driven by an explicit objective function, which is feasible for the whole l2-space. The classic chicken-and-egg loop in k-means is simplified to a pure stochastic optimization procedure. The procedure becomes simpler and converges to a considerably better local optimum. The effectiveness of this new variant has been studied extensively in different contexts, such as document clustering, nearest neighbor search and image clustering. Superior performance is observed across different…
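The abstract does not state the objective function, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the objective is the standard rewriting of the k-means sum of squared errors, in which minimizing the error is equivalent to maximizing I = sum over clusters r of ||D_r||^2 / n_r, where D_r is the sum of the points in cluster r and n_r is its size. Points are visited in random order and each one is moved to the cluster that most increases I. The names incremental_kmeans, objective, and sq_norm are hypothetical.

```python
import random


def sq_norm(v):
    """Squared Euclidean norm of a vector given as a sequence of floats."""
    return sum(a * a for a in v)


def objective(sums, counts):
    """Assumed objective: I = sum_r ||D_r||^2 / n_r (larger is better)."""
    return sum(sq_norm(d) / n for d, n in zip(sums, counts) if n > 0)


def incremental_kmeans(points, k, max_epochs=50, seed=0):
    """Sketch of objective-driven incremental clustering (assumes len(points) >= k)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial partition in which every cluster is non-empty.
    labels = [i % k for i in range(n)]
    rng.shuffle(labels)

    # Per-cluster composite vectors D_r (sums of member points) and sizes n_r.
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, r in zip(points, labels):
        counts[r] += 1
        for j in range(dim):
            sums[r][j] += p[j]

    order = list(range(n))
    for _ in range(max_epochs):
        rng.shuffle(order)  # stochastic visiting order
        moved = False
        for i in order:
            x, u = points[i], labels[i]
            if counts[u] == 1:
                continue  # never empty a cluster
            x2 = sq_norm(x)
            du2 = sq_norm(sums[u])
            dot_u = sum(a * b for a, b in zip(sums[u], x))
            # Change in I caused by removing x from its current cluster u.
            loss = (du2 - 2.0 * dot_u + x2) / (counts[u] - 1) - du2 / counts[u]

            best_v, best_delta = u, 1e-12  # require a strict improvement
            for v in range(k):
                if v == u:
                    continue
                dv2 = sq_norm(sums[v])
                dot_v = sum(a * b for a, b in zip(sums[v], x))
                # Change in I caused by adding x to cluster v.
                gain = (dv2 + 2.0 * dot_v + x2) / (counts[v] + 1) - dv2 / counts[v]
                if loss + gain > best_delta:
                    best_v, best_delta = v, loss + gain

            if best_v != u:
                # Apply the move and update the composite vectors in place.
                for j in range(dim):
                    sums[u][j] -= x[j]
                    sums[best_v][j] += x[j]
                counts[u] -= 1
                counts[best_v] += 1
                labels[i] = best_v
                moved = True
        if not moved:
            break  # no single-point move improves I: a local optimum
    return labels
```

In this sketch there is no separate centroid-update step: each accepted move updates the cluster sums immediately, and every accepted move strictly increases I, so the loop stops at a partition that no single reassignment can improve.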
Taxonomy
Topics: Advanced Statistical Methods and Models
